Abstract. The μ-calculus is a powerful tool for specifying and verifying
transition systems, including those with both demonic (universal) and
angelic (existential) choice; its quantitative generalisation qMμ [17, 29, 9]
extends that to probabilistic choice.

We show here that for a finite-state system the logical interpretation of
qMμ, via fixed-points in a domain of real-valued functions into [0, 1], is
equivalent to an operational interpretation given as a turn-based
gambling game between two players.

The equivalence sets qMμ on a par with the standard μ-calculus, in that
it too can benefit from a solid interface linking the logical and operational
frameworks.

The logical interpretation provides direct access to axioms, laws and
meta-theorems. The operational, game-based interpretation aids the
intuition and continues in the more general context to provide a
surprisingly practical specification tool, meeting for example Vardi's
challenge to "figure out the meaning of AFAXp" as a branching-time formula.

A corollary of our proofs is an extension of Everett's singly-nested games
result in the finite turn-based case: we prove well-definedness of the
minimax value, and existence of fixed memoriless strategies, for all qMμ
games/formulae, of arbitrary (including alternating) nesting structure.



1 Introduction 

The standard μ-calculus, introduced by Kozen, extends Boolean dynamic
program logic by the introduction of least (μ) and greatest (ν) fixed-point
operators. Its proof system is applicable to both infinite and finite state
spaces; recent results [17] have established a complete axiomatisation; and
it can be specialised to temporal logic. Thus it has a simple semantics, and
a proof theory.

But its operational significance can be more elusive: general μ-calculus
expressions can be difficult to use, because in all but the simplest cases
they are not easy on the intuition. Alternating fixed points can be especially
intricate; and even the more straightforward (alternation-free) temporal
subset has properties (particularly "branching-time properties") that are
notoriously difficult to specify, as Vardi points out.

Stirling's "two-player-game" interpretation alleviates this problem by
providing an alternative and complementary operational view [43].

The quantitative modal μ-calculus acts over probabilistic transition systems,
extending the standard axioms from Boolean- to real values; and it would benefit
just as much from having two complementary interpretations. Our goal in this
paper is to identify them, and to give the proof of their equivalence, over a finite
state space: one interpretation (defined earlier [29, 17]) generalises Kozen's by
lifting it from the Booleans into the reals; the other (defined here) generalises
Stirling's.

Our principal contribution here is thus the definition of the Stirling-style but
quantitative interpretation, and the proof of its equivalence to the Kozen-style
quantitative interpretation. We show also that memoriless strategies suffice, for
both interpretations, again when the state space is finite.

The Kozen-style quantitative interpretation is based on our earlier
extension [19, 34] of Dijkstra/Hoare logic to probabilistic/demonic programs
(corresponding to the ∀ modality): it is a real-valued logic based on "greatest
pre-expectations" of random variables, rather than weakest preconditions of
predicates. It can express the specific "probability of achieving a postcondition,"
since the probability of an event is the expected value of its characteristic
function, but it applies more generally to other cost-based properties besides.
Although the specific approach, i.e. with its explicit probabilities, may be more
intuitive, the extra generality of a full quantitative logic seems necessary for
compositionality.

Converting predicates "wholesale" from Boolean- to real-valued state
functions, due originally to Kozen and extended by us to include demonic
(universal) and angelic (existential) [22] nondeterminism, contrasts with
probabilistic logics using "threshold functions" that mix Boolean and
numeric arguments: the uniformity in our case means that standard Boolean
identities in branching-time temporal logic suggest corresponding quantitative
laws for us [30], and so we get a powerful collection of algebraic properties "for
free." The logical "implies" relation between Booleans is replaced by the
standard "≤" order on the reals; false and true become 0 and 1; and fixed points
are then associated with monotonic real- rather than Boolean-valued functions.
The resulting arithmetic logic is applicable to a restricted class of real-valued
functions, and we recall its definition in Sec. 3.

Footnote: The quantitative Kozen interpretation has been given earlier
[17, 29, 9]: we review it here.

Footnote: See Sec. 5.5 for an example of this.

Our Stirling-style quantitative interpretation is operational, and is based on
his earlier strategy-based game metaphor for the standard μ-calculus. In our
richer context, however, we must distinguish nondeterministic choice (both
demonic and angelic) from probabilistic choice: the former continues to be
represented by the two players' strategies; but the latter is represented by the
new feature that we make the players gamble. In Sec. 4 we set out the details.

In Sec. 5 we give a worked example of the full use of the quantitative aspects
of the calculus, beyond simply calculating probabilities.

The main mathematical result of this paper is given in Sec. 6, though much
of the detail is placed in the appendices.

Stirling showed that for non-probabilistic formulae the Boolean value of the
Kozen interpretation corresponds to the existence of a winning strategy in his
game interpretation. In our case, strategies in the game must become "optimal"
rather than "winning"; and the correspondence is now between a formula's value
(since it denotes a real number, in the Kozen interpretation) and the expected
winnings from the zero-sum gambling game (of the Stirling interpretation). Since
the gambling game described by a formula is a "minimax," we must show it to be
well-defined (equal to the "maximin"): in fact we show that both the minimax
and the maximin of the game are equal to the Kozen-style denotation of the
formula that generated it.

We also prove that memoriless strategies suffice. 

Both proofs apply only to finite state spaces. 

The benefit of our proved equivalence is to set the quantitative μ-calculus on
a par with standard μ-calculus in that a suitable form of "logical validity"
corresponds exactly to an operational interpretation. As with standard μ-calculus,
a specifier can use the operational semantics to build his intuitions into a game,
and can then use the general features of the logic (whose soundness has been
proved relative to the logical semantics) to prove properties about the
specific application. For example, the sublinearity [34] of qMμ, the quantitative
generalisation of the conjunctivity of standard modal algebras, has been used
in its quantitative temporal subset qTL to prove a number of algebraic laws
corresponding to those holding in standard branching-time temporal logic [30].

Preliminary experiments have shown that the proof system is very effective
for unravelling the intricacies of distributed protocols. Moreover it
provides an attractive proof framework for Markov decision processes, and
indeed many of the problems there have a succinct specification as μ-calculus
formulae, as the example of Sec. 5 illustrates. In "reachability-style problems",
proof-theoretic methods based on the logic presented here have produced
very direct arguments related to the abstraction of probabilities, and even
more telling is that the logic is applicable even in infinite state spaces. All of
which is to suggest that further exploration of qMμ will continue to be fruitful.

Footnote: using both PRISM [21] and Mathematica®.

Footnote: We also call this the denotational semantics.

In the following we shall assume generally that S is a countable state space
(though for the principal result we restrict to finiteness, in Sec. 6). If f is a
function with domain X then by f.x we mean f applied to x, and f.x.y is (f.x).y
where appropriate; functional composition is written with ∘, so that (f∘g).x =
f.(g.x). We denote the set of discrete probability sub-distributions over a set S
by \(\overline{S}\): it is the set of functions from S into the real interval [0, 1] that
sum to no more than one; and if A is a random variable with respect to some
probability space, and δ is some probability sub-distribution, we write
\(\int_\delta A\) for the expected value of A with respect to δ. In the special case
that δ is in \(\overline{S}\) and A is a bounded real-valued function on S, in fact
\(\int_\delta A\) is equal to \(\sum_{s:S} A.s \times \delta.s\).

2 Probabilistic transition systems and μ-calculus

In this section we set out the logical language, together with some details about 
the probabilistic systems over which the formulae are to be interpreted. 
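Formulae in the logic (in positive form) are constructed as follows:

\[ \phi \;::=\; X \mid A \mid \langle K\rangle\phi \mid [K]\phi \mid \phi_1 \sqcap \phi_2 \mid \phi_1 \sqcup \phi_2 \mid \phi_1 \lhd G \rhd \phi_2 \mid (\mu X \cdot \phi) \mid (\nu X \cdot \phi) \]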

- Variables X are of type \(S \to [0, 1]\), and are used for binding fixed points.

- Terms A stand for fixed functions in \(S \to [0, 1]\).

- Terms K represent finite non-empty sets of probabilistic state-to-state
transitions in \(\mathcal{R}.S\) (see below), with \(\langle\cdot\rangle\) and \([\cdot]\)
forming respectively angelic (existential) and demonic (universal) modalities
from them.

- Terms G describe Boolean functions of S, used in \(\lhd\) ("if") G \(\rhd\) ("else") style.

It is well known that such formulae can be used to express complex
path-properties of computational sequences. In this paper we interpret the formulae
over sequences based on generalised probabilistic transitions in what we call
\(\mathcal{R}.S\), the functions t in \(S \to \overline{S_\$}\) where \(S_\$\) is just the
state space S with a special "payoff" state $ adjoined. Thus \(\overline{S_\$}\) is the
set of sub-distributions over that, so that the elements t of \(\mathcal{R}.S\) give the
probability of passage from initial s to final (proper) s′ as t.s.s′; any deficit
\(1 - \sum_{s':S} t.s.s'\) is interpreted as the probability of an immediate halt
with payoff

\[ t.s.\$ \,/\, \Bigl(1 - \sum_{s':S} t.s.s'\Bigr) \qquad (1) \]

Footnote: Normal mathematical practice is to write \(\int A\,d\delta\), but that greatly
confuses the roles of bound and free variables: it makes the distribution (measure)
variable δ in dδ free in the expression; but in the analogous \(\int f(x)\,dx\) of
analysis, the independent variable x in dx is bound.

Footnote: The restriction to the positive fragment is for the usual reason: that
the interpretation of any expression \((\lambda X \cdot \phi)\), constructed according
to the given rules, should yield a monotone function of X.

Footnote: They correspond to the "game rounds" of Everett.

[Figure 1: diagram not reproduced. It shows a relational element with entries
s → H: 1/4, s → T: 1/4, s → $: 2/5, and the transition tree it denotes, with
branches s → H (probability 1/4), s → T (probability 1/4) and s → $0.80
(probability 1/2).]

The relational element in \(\mathcal{R}.S\) shown, with its deficit of 1/10, denotes the
transition in which the transition probabilities now sum to one. In particular,
the probability of transition to $ is 1/2, which makes the expected immediate
payoff equal to 1/2 × 0.80 = 2/5 as given explicitly in the relation.

Since H, T are states, they may lead further: no matter where they lead, however, the
expected reward of the subtrees rooted there cannot exceed one, and so our encoding
ensures both that the transition probabilities from s sum to one exactly (since the
probability of transition s → $ is 1/2 = 1 − (1/4 + 1/4) by definition), and that the
expected reward from this tree (rooted at s) cannot exceed one either (since the actual
reward $0.80 is defined just so that 2/5 = 0.8 × 1/2 will hold).

The tree does not continue on from the payoff state $0.80.

More generally, a "normal" transition, i.e. with \(\sum_i p_i = 1\), can effectively be
"α-discounted" by scaling each \(p_i\) to \(\alpha p_i\), which leaves a deficit of
\(1 - \alpha\) for the transition to $.

Fig. 1. Example of payoff-state encoding



See Fig. 1 for an example.

This formulation of the payoff — i.e. "pre-divided" by its probability of oc- 
currence — has three desirable properties. The first is that the probabilistically 
expected halt-and-payoff is just t.s.$, i.e. is given directly by t. The second prop- 
erty is that we can consider the probabilities of outcomes from s to sum to one 
exactly (rather than no more than one), since any deficit is "soaked up" in the 
probability of transit to payoff, which simplifies our operational interpretation. 
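The payoff encoding can be checked on the numbers of Fig. 1. The following is only a minimal sketch: the dictionary representation of one row t.s and the names `decode` and `halt_prob` are assumptions made for illustration, not notation from the paper.

```python
from fractions import Fraction as F

def decode(row):
    """Split one row t.s into (full distribution, actual payoff at $).

    `row` maps proper states to probabilities and "$" to the expected
    halt-and-payoff t.s.$ (hypothetical representation).
    """
    to_states = {s: p for s, p in row.items() if s != "$"}
    halt_prob = 1 - sum(to_states.values())      # probability of moving to $
    expected = row.get("$", F(0))                # t.s.$ itself
    assert 0 <= expected <= halt_prob, "row is not a sub-distribution"
    # Formula (1): the payoff is "pre-divided" by its probability of occurrence.
    # When there is no chance of halting, the payoff is defined to be zero.
    payoff = expected / halt_prob if halt_prob > 0 else F(0)
    full = dict(to_states)
    full["$"] = halt_prob                        # probabilities now sum to one
    return full, payoff

fig1_row = {"H": F(1, 4), "T": F(1, 4), "$": F(2, 5)}
full, payoff = decode(fig1_row)
```

With the row of Fig. 1, the halting probability comes out as 1/2 and the actual payoff as 4/5, i.e. $0.80.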

The third property is that transitions preserve one-boundedness in the following
sense. Define the set of expectations \(\mathcal{E}S\) (over S) to be the set of
one-bounded functions \(S \to [0, 1]\). If A in \(\mathcal{E}S\) gives a
"post-expectation" A.s′ expected to be realised at state s′ after transition t, then
the "pre-expectation" at s before transition t is

\[ t.s.\$ + \int_{t.s} A , \]

where the sub-distribution t.s under \(\int\) is restricted to states in S proper.

It is the expected value realised by making transition t from s to s′ or possibly $,
taking A.s′ in the former case and the payoff (1) in the latter. That this
pre-expectation is also one-bounded, i.e. is in \(\mathcal{E}S\), allows us to confine
our work to the real interval [0, 1] throughout.

Footnote: If \(\sum_{s':S} t.s.s'\) is one, then t.s.$ must be zero, because elements
of \(\overline{S_\$}\) sum to no more than one. In that case we define the actual
payoff and the expected payoff both to be zero.

Footnote: To avoid clutter we will assume this restriction where necessary in the sequel.
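As a small illustration of the pre-expectation, the sketch below evaluates t.s.$ plus the expected value of A over the proper states, again for the row of Fig. 1. The post-expectation `post` is a made-up example, and the dictionary representation is an assumption.

```python
from fractions import Fraction as F

def pre_expectation(row, post):
    """t.s.$ plus the expected value of `post` over the proper successor states."""
    return row.get("$", F(0)) + sum(p * post[s] for s, p in row.items() if s != "$")

fig1_row = {"H": F(1, 4), "T": F(1, 4), "$": F(2, 5)}
post = {"H": F(1), "T": F(0)}          # hypothetical post-expectation A
value = pre_expectation(fig1_row, post)
ceiling = pre_expectation(fig1_row, {"H": F(1), "T": F(1)})
```

Here `value` is 2/5 + 1/4 = 13/20, and even the largest post-expectation (constantly one) gives only 9/10, which illustrates one-boundedness for this row.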

Hence computation trees can be constructed by "pasting together" applications
of transitions \(t_0, t_1, \ldots\) drawn from \(\mathcal{R}.S\), with branches to $
being tips. The probabilities attached to the individual steps then generate a
distribution over computational paths (which is defined by the sigma-algebra of
extensions of finite sequences, a well-known construction [12]).

We use the relation ≤, "everywhere no more than", between expectations
(thus replacing "implies"):

\[ A \le A' \quad\text{iff}\quad (\forall s : S \cdot A.s \le A'.s) . \]

In our interpretations we will use valuations in the usual way. Given a formula
φ, a valuation V does four things: (i) it maps each A in φ to a fixed expectation in
\(\mathcal{E}S\); (ii) it maps each K to a fixed, non-empty finite set of probabilistic
transitions in \(\mathcal{R}.S\); (iii) it maps each G to a predicate over S; and (iv) it
keeps track of the current instances of "unfoldings" of fixed points, by including
mappings for bound variables X. (For notational economy, in (iv) we are allowing V
to take over the role usually given to a separate "environment" parameter.)

We make one simplification to our language, without compromising
expressivity. Because the valuation V assigns finite sets to all occurrences of K, we
can replace each modality \(\langle K\rangle\phi\) (resp. \([K]\phi\)) by an explicit
maxjunct \(\bigsqcup_{k:K}\{k\}\phi\) (resp. minjunct \(\bigsqcap_{k:K}\{k\}\phi\)) of
(symbols k denoting) transitions k in the set (denoted by) K. We do this because our
interpretations conveniently do not distinguish between \(\langle K\rangle\) and
\([K]\) when K is a singleton set.

In the rest of this paper we shall therefore use the reduced language given by

\[ \phi \;::=\; X \mid A \mid \{k\}\phi \mid \phi_1 \sqcap \phi_2 \mid \phi_1 \sqcup \phi_2 \mid \phi_1 \lhd G \rhd \phi_2 \mid (\mu X \cdot \phi) \mid (\nu X \cdot \phi) . \]

We replace (ii) above in respect of V by: (ii') it maps each occurrence of {k} to
a probabilistic transition in \(\mathcal{R}.S\).
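The reduction to singleton modalities is a purely syntactic rewrite. As a rough sketch, formulae are encoded here as nested tuples; the tags `"dia"`, `"box"`, `"step"` and so on are an encoding invented for this illustration, not the paper's.

```python
from functools import reduce

def expand(phi):
    """Rewrite <K>phi and [K]phi into max/min-juncts of single steps {k}phi.

    Assumed encoding: ("var", X), ("const", A), ("dia", [k...], phi),
    ("box", [k...], phi), ("min", l, r), ("max", l, r),
    ("if", G, l, r), ("mu", X, phi), ("nu", X, phi).
    """
    tag = phi[0]
    if tag in ("var", "const"):
        return phi
    if tag in ("dia", "box"):
        _, ks, body = phi                    # ks is assumed non-empty and finite
        body = expand(body)
        op = "max" if tag == "dia" else "min"
        steps = [("step", k, body) for k in ks]
        return reduce(lambda left, right: (op, left, right), steps)
    if tag in ("min", "max"):
        return (tag, expand(phi[1]), expand(phi[2]))
    if tag == "if":
        return ("if", phi[1], expand(phi[2]), expand(phi[3]))
    if tag in ("mu", "nu"):
        return (tag, phi[1], expand(phi[2]))
    raise ValueError(f"unknown formula tag: {tag}")

angelic = expand(("dia", ["a", "b"], ("var", "X")))
singleton = expand(("box", ["a"], ("const", "A")))
```

For a singleton set the angelic and demonic modalities both collapse to the same single step, which is the observation that justifies the reduced language.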

3 Denotational interpretation:
qMμ generalises Kozen's logic

In this section we recall how the quantitative logic for
nondeterministic/probabilistic sequential programs [19, 34], from which we inherit
the use of expectations and the semantic definition \(\|\{k\}\phi\|\) below, leads to
a quantitative generalisation of Kozen's logical interpretation of μ-calculus,
suitable for probabilistic transition systems.

Let φ be a formula and V a valuation. We write \(\|\phi\|_V\) for its meaning, an
expectation in \(\mathcal{E}S\) determined by the rules given in Fig. 2. Part of the
contribution of our previous work [29, 30] is summarised in the following lemma.



Footnote: We see below that tips are made by constant terms A as well.

1. \(\|X\|_V = V.X\)

2. \(\|A\|_V = V.A\)

3. \(\|\{k\}\phi\|_V.s \;=\; V.k.s.\$ + \int_{V.k.s} \|\phi\|_V\)

4. \(\|\phi' \sqcap \phi''\|_V.s \;=\; \|\phi'\|_V.s \;\min\; \|\phi''\|_V.s\) ; and
   \(\|\phi' \sqcup \phi''\|_V.s \;=\; \|\phi'\|_V.s \;\max\; \|\phi''\|_V.s\) .

5. \(\|\phi' \lhd G \rhd \phi''\|_V.s \;=\; \|\phi'\|_V.s\) if \((V.G.s)\) else \(\|\phi''\|_V.s\)

6. \(\|(\mu X \cdot \phi)\|_V \;=\; (\mathrm{lfp}\, x \cdot \|\phi\|_{V[X \mapsto x]})\), where by
   \((\mathrm{lfp}\, x \cdot exp)\) we mean the least fixed-point of the function
   \((\lambda x \cdot exp)\).

7. \(\|(\nu X \cdot \phi)\|_V \;=\; (\mathrm{gfp}\, x \cdot \|\phi\|_{V[X \mapsto x]})\)

Note that in the valuation \(V[X \mapsto x]\), the variable X is mapped to the expectation x.

Fig. 2. Kozen-style denotational semantics for qMμ



Lemma 1. The quantitative logic qMμ is well-defined. For any φ in the
language, and valuation V, the interpretation \(\|\phi\|_V\) is a well-defined
expectation in \(\mathcal{E}S\).

Proof. Structural induction: arithmetic, that our formulae express only
monotone functions, and that \((\mathcal{E}S, \le)\) is a complete partial order.
(Recall that \(\mathcal{E}S\) is [0, 1]-bounded.)
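For a finite state space the rules of Fig. 2 can be turned into a small evaluator. The sketch below is an approximation under stated assumptions rather than the authors' implementation: formulae use an invented tuple encoding, valuations are plain dictionaries, and fixed points are approximated by a fixed number of iterations from the bottom (for μ) or top (for ν) expectation, so nested fixed points are approximate and cost grows quickly with nesting depth.

```python
def sem(phi, V, S, rounds=200):
    """Approximate ||phi||_V as a dict from states to values in [0, 1].

    Assumed encoding: ("var", X), ("const", A), ("step", k, phi),
    ("min", l, r), ("max", l, r), ("if", G, l, r), ("mu", X, phi), ("nu", X, phi).
    V maps A and X to dicts, k to {state: {successor or "$": prob}}, G to a predicate.
    """
    tag = phi[0]
    if tag in ("var", "const"):
        return dict(V[phi[1]])
    if tag == "step":
        t, post = V[phi[1]], sem(phi[2], V, S, rounds)
        # Rule 3: expected immediate payoff plus expected post-expectation.
        return {s: t[s].get("$", 0.0)
                   + sum(p * post[u] for u, p in t[s].items() if u != "$")
                for s in S}
    if tag in ("min", "max"):
        left, right = sem(phi[1], V, S, rounds), sem(phi[2], V, S, rounds)
        pick = min if tag == "min" else max
        return {s: pick(left[s], right[s]) for s in S}
    if tag == "if":
        left, right = sem(phi[2], V, S, rounds), sem(phi[3], V, S, rounds)
        return {s: left[s] if V[phi[1]](s) else right[s] for s in S}
    if tag in ("mu", "nu"):
        x = {s: (0.0 if tag == "mu" else 1.0) for s in S}   # bottom or top
        for _ in range(rounds):
            x = sem(phi[2], {**V, phi[1]: x}, S, rounds)
        return x
    raise ValueError(f"unknown formula tag: {tag}")

S = ["s"]
V = {"flip": {"s": {"s": 0.5, "$": 0.5}},    # halt with probability 1/2, payoff 1
     "loop": {"s": {"s": 1.0}}}              # never halts
lfp_flip = sem(("mu", "X", ("step", "flip", ("var", "X"))), V, S)
lfp_loop = sem(("mu", "X", ("step", "loop", ("var", "X"))), V, S)
gfp_loop = sem(("nu", "X", ("step", "loop", ("var", "X"))), V, S)
```

The two `loop` results show the difference between the fixed points: every expectation is a fixed point of the looping step, so the least one is zero and the greatest is one.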

4 Operational interpretation:
qMμ generalises Stirling's game

In this section we give an alternative account of formulae (of the reduced
language), in terms of a generalisation of Stirling's turn-based game. The
game is between two players, to whom we refer as Max and Min. As in Sec. 2 we
assume a probabilistic transition system \(\mathcal{R}.S\) and a valuation V. Play
progresses through a sequence of game positions, each of which is either a pair
(φ, s) where φ is a formula and s is a state in S, or a single (y) for some
real-valued payoff y in [0, 1]. Following Stirling, we will use the idea of
"colours" to handle repeated returns to a fixed point.

A sequence of game positions is called a game path and is of the form
\((\phi_0, s_0), (\phi_1, s_1), \ldots\) with (if finite) a payoff position (y) at the
end. The initial formula \(\phi_0\) is the given φ, and \(s_0\) is an initial state in S.
A move from position \((\phi_i, s_i)\) to \((\phi_{i+1}, s_{i+1})\) or to (y) is
specified by the rules of Fig. 3.

A game path is said to be valid if it can occur as a sequence according to
the above rules. Note that along any game path at most one colour can appear
infinitely often:

Lemma 2. All valid game paths are either finite, terminating at some payoff
(y), or infinite; if infinite, then exactly one colour appears infinitely often.

If the current game position is \((\phi_i, s_i)\), then play proceeds as follows:

1. Free variables X do not occur in the game: their role is taken over by "colours"
(see Cases 6-8 below).

2. If \(\phi_i\) is A then the game terminates in position (y) where \(y = V.A.s_i\).

3. If \(\phi_i\) is \(\{k\}\phi\) then the distribution \(V.k.s_i\) is used to choose either
a next state s′ in S or possibly the payoff state $. If a state s′ is chosen, then the
next game position is (φ, s′); if $ is chosen, then the next position is (y), where y
is the payoff \(V.k.s_i.\$ \,/\, (1 - \sum_{s':S} V.k.s_i.s')\), and the game terminates.

4. If \(\phi_i\) is \(\phi' \sqcap \phi''\) (resp. \(\phi' \sqcup \phi''\)) then Min (resp. Max)
chooses one of the minjuncts (maxjuncts): the next game position is \((\phi, s_i)\),
where φ is the chosen 'junct φ′ or φ″.

5. If \(\phi_i\) is \(\phi' \lhd G \rhd \phi''\), the next game position is \((\phi', s_i)\) if
\(V.G.s_i\) holds, and otherwise it is \((\phi'', s_i)\).

6. If \(\phi_i\) is \((\mu X \cdot \phi)\) then a fresh colour C is chosen and is bound to
the formula \(\phi_{[X \mapsto C]}\) for later use; the next game position is \((C, s_i)\).

7. If \(\phi_i\) is \((\nu X \cdot \phi)\), then a fresh colour C is chosen and bound as for μ.

8. If \(\phi_i\) is a colour C, then the next game position is \((\phi, s_i)\), where φ is
the formula bound previously to C.

The game begins with a closed formula (refer Item 1 above).

Infinite games result in there being a single colour C that occurs infinitely often;
finite games end in a payoff (y) for 0 ≤ y ≤ 1.

Fig. 3. Rules for playing probabilistic formula-game.

Footnote: Free variables do play a role in our more detailed analysis later.

Footnote: This use of colours is taken from Stirling [43]: in an appendix we formalise
the operations of choosing fresh colours and binding them to formulae. The colour
device allows easy determination, later on, of which recursion operator actually
"caused" an infinite path.

Footnote: The two kinds of fixed point are not distinguished at this stage: see Def. 1 below.

Proof. Stirling [43].

To complete the description of the game, one would normally give the
winning/losing conditions. Here however we are operating over real- rather than
Boolean values, and we speak of the "value" of the game. In the choices
\(\phi' \sqcup \phi''\) (resp. \(\phi' \sqcap \phi''\)) player Max (resp. Min) follows a
strategy in which he tries to maximise (minimise) a real-valued "payoff" associated
with the game, defined as follows.

Definition 1. Value of a path. The value of a path is determined by a fixed
function Val defined by cases as follows:

1. The path π is finite, terminating in a game state (y); in this case the value
Val.π is y.

2. The path π is infinite and there is a colour C appearing infinitely often that
was generated by a greatest fixed-point ν; in this case Val.π is 1.

3. The path π is infinite and there is a colour C appearing infinitely often that
was generated by a least fixed-point μ; in this case Val.π is 0.

In Sec. 6.1 we make precise this notion of "value of a game," and its
interaction with strategies.
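The rules of the game can be sketched as a simulator of a single play. Everything about the representation below is an assumption made for illustration: formulae are nested tuples, colours are tuples `("colour", n)`, and the two strategies are memoriless functions returning 0 or 1 for the chosen 'junct. A simulation cannot observe an infinite path, so plays that have not ended after `max_moves` moves are returned as `None` rather than being given the value 0 or 1 that Def. 1 would assign.

```python
import random

def subst(phi, x, colour):
    """phi with free occurrences of variable x replaced by the given colour."""
    tag = phi[0]
    if tag == "var":
        return colour if phi[1] == x else phi
    if tag in ("const", "colour"):
        return phi
    if tag == "step":
        return ("step", phi[1], subst(phi[2], x, colour))
    if tag in ("min", "max"):
        return (tag, subst(phi[1], x, colour), subst(phi[2], x, colour))
    if tag == "if":
        return ("if", phi[1], subst(phi[2], x, colour), subst(phi[3], x, colour))
    if tag in ("mu", "nu"):                       # stop where x is rebound
        return phi if phi[1] == x else (tag, phi[1], subst(phi[2], x, colour))
    raise ValueError(f"unknown formula tag: {tag}")

def play(phi, s, V, choose_min, choose_max, rng, max_moves=10_000):
    """Play one game path from (phi, s); return its payoff, or None if unfinished."""
    bound = {}                                    # colour -> formula bound to it
    for _ in range(max_moves):
        tag = phi[0]
        if tag == "const":                        # rule 2: terminate with V.A.s
            return V[phi[1]][s]
        if tag == "step":                         # rule 3: probabilistic move
            row = V[phi[1]][s]
            r, acc, nxt = rng.random(), 0.0, "$"
            for u, pr in row.items():
                if u == "$":
                    continue
                acc += pr
                if r < acc:
                    nxt = u
                    break
            if nxt == "$":
                halt = 1.0 - sum(pr for u, pr in row.items() if u != "$")
                return row.get("$", 0.0) / halt if halt > 0 else 0.0
            phi, s = phi[2], nxt
        elif tag in ("min", "max"):               # rule 4: a player chooses
            pick = choose_min if tag == "min" else choose_max
            phi = phi[1 + pick(phi, s)]
        elif tag == "if":                         # rule 5: Boolean test
            phi = phi[2] if V[phi[1]](s) else phi[3]
        elif tag in ("mu", "nu"):                 # rules 6 and 7: fresh colour
            colour = ("colour", len(bound))
            bound[colour] = subst(phi[2], phi[1], colour)
            phi = colour
        elif tag == "colour":                     # rule 8: unfold the colour
            phi = bound[phi]
        else:
            raise ValueError(f"unexpected position: {tag}")
    return None

V = {"flip": {"s": {"s": 0.5, "$": 0.5}}, "Zero": {"s": 0.0}}
game = ("mu", "X", ("max", ("step", "flip", ("var", "X")), ("const", "Zero")))
first = lambda phi, s: 0
second = lambda phi, s: 1
keep_flipping = play(game, "s", V, first, first, random.Random(0))
give_up = play(game, "s", V, first, second, random.Random(0))
```

In this toy game Max does better by always taking the `flip` branch, since every finished play then pays 1, whereas choosing the constant pays 0.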



5 Worked example: investing in the futures market 
5.1 Describing a game 

Typical properties of probabilistic systems are usually cost-based, and to illus- 
trate that we give an example involving money. Concerning general expected 
values, it lies strictly outside the scope of "plain" probabilistic temporal logic. 

An investor I has been given the right to make an investment in "futures,"
a fixed number of shares in a specific company that he can reserve on the
first day of any month he chooses. Exactly one month later, the shares will be
delivered and will collectively have a market value on that day; he can sell
them then if he wishes.

His problem is to decide when to make his reservation so that the subse- 
quent sale has maximum value. 

The details are as follows: 

1. The market value v of the shares is a whole number of dollars between $0
and $10 inclusive; it has a probability p of going up by $1 in any month,
and 1 − p of going down by $1, but it remains within those bounds. The
probability p represents short-term market uncertainty.



Footnote: In fact attributing the wins/losses to the two players makes it into a zero-sum game.
Footnote: At (+) in Sec. we discuss the related problem of maximising profit.






AK McIver and CC Morgan



2. Probability p itself varies month-by-month in steps of 0.1 between zero and
one: when v is less than $5 the probability that p will rise is 2/3; when v is
more than $5 the probability of p's falling is 2/3; and when v is $5 exactly
the probability is 1/2 of going either way. The movement of p represents
investors' knowledge of long-term "cyclic second-order" trends.

3. There is a cap c on the value of v, initially $10, which has probability 1/2 of
falling by $1 in any month; otherwise it remains where it is. (This modifies
Item 1 above.) The "falling cap" models the fact that the company is in a
slow decline.

4. If in a given month the investor does not reserve, then at the very next 
month he might find he is temporarily barred from doing so. But he cannot 
be barred two months consecutively. 

5. If he never reserves, then he never sells and his return is thus zero. 

If it were not for Item 3, the investor's strategy would be the obvious "wait
until v ≥ 9 ∧ p = 1, however long that takes, and make a reservation
then." But the falling cap defeats that, effectively discounting the payoff as time
passes. Below we consider more sophisticated strategies that take that into
account.

The situation is summed up by the transition system set out in Fig. 4.

— During each month there are three purely probabilistic actions that occur, 
and their compounded effects determine a transition m, which we will call 
month in our formula (to come); 

— At the beginning of each month, the investor makes a maximising (angelic) 
choice of whether to reserve; but, if he does not, then 

— At the beginning of the next month, there is a minimising (demonic) choice 
of whether he is barred. 

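The utility of our game interpretation in Sec. 4 is that we can easily use the
intuition it provides to write a formula describing the above system. The state
space is (v, p, c), and we use a transition

\[ \begin{array}{l}
m \;=\; v := (v+1) \sqcap c \;\; {}_{p}\oplus \;\; (v-1) \sqcup 0; \\
\qquad \textbf{if } v < 5 \textbf{ then } p := (p+0.1) \sqcap 1 \;\; {}_{2/3}\oplus \;\; (p-0.1) \sqcup 0 \\
\qquad \textbf{elsif } v > 5 \textbf{ then } p := (p-0.1) \sqcup 0 \;\; {}_{2/3}\oplus \;\; (p+0.1) \sqcap 1 \\
\qquad \textbf{else } p := (p-0.1) \sqcup 0 \;\; {}_{1/2}\oplus \;\; (p+0.1) \sqcap 1 \\
\qquad \textbf{fi}; \\
\qquad c := (c-1) \sqcup 0 \;\; {}_{1/2}\oplus \;\; c
\end{array} \]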

Footnote 15: With the cap c fixed at $10 we know that from any state there is a
non-zero probability, however small, of reaching v ≥ 9 ∧ p = 1 eventually; but, with
the Zero-One Law [15, 27, 25] for probabilistic processes, that means in fact that
v ≥ 9 ∧ p = 1 will be reached eventually with probability one. So "waiting" would be
the correct strategy, because when v ≥ 9 ∧ p = 1 finally occurs, an immediate
reservation is guaranteed to pay $10 in a month's time.

[Figure 4: diagram not reproduced. One part shows transition m, which models one
month's stock-market activity, "exploded" into its probabilistic choices: value v
goes up or goes down; probability p goes up or goes down; cap c goes down or is
unchanged. The other part shows the nondeterministic choices surrounding m.]

The thick arrow, "exploded" on the right, represents the effect of one month's
stock-market activity: symbol ⊕ labels the probabilistic choices it entails. The share
value v may rise or fall, according to p; probability p itself may rise or fall,
according to long-term trends; and the capped value c of the stock may fall.

This represents the non-deterministic choices available to the investor (maximising
player) and the market (minimising player). Symbol ⊔ represents the investor's choice;
symbol ⊓ represents the market's choice.

(The probabilistic choices occur "within" the thick arrows.)

Fig. 4. Futures trading on the stock market: example.

to capture the effect of the large arrows.

We can then use our (reduced) logical language to describe the surrounding
angelic and demonic choices, including the "loop back" (fixed point) which gives
value zero (i.e. μ) if it never terminates. Using month to denote m, and a constant
expectation Sold to denote the function v returning just the v component of
the state, we would write our formula as

\[ Game \;=\; (\mu X \cdot \{month\}Sold \;\sqcup\; \{month\}(X \sqcap \{month\}X)) . \qquad (2) \]
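One month's activity m can be written out as an explicit distribution over successor states. The sketch below makes two labelled assumptions: p is stored as a whole number of tenths (0 to 10) so that exact fractions can be used, and the three assignments are read as a sequential composition, so the update of p tests the new value of v. The helper `expected_sold` computes the pre-expectation of Sold over one month.

```python
from collections import defaultdict
from fractions import Fraction as F

def month(v, p, c):
    """Distribution over next states (v, p, c); p is in tenths (assumption)."""
    out = defaultdict(F)
    up = F(p, 10)
    # v := (v+1) min c  with probability p, else (v-1) max 0
    for v2, pv in ((min(v + 1, c), up), (max(v - 1, 0), 1 - up)):
        # probability that p rises depends on the (new) share value
        rise = F(2, 3) if v2 < 5 else F(1, 3) if v2 > 5 else F(1, 2)
        for p2, pp in ((min(p + 1, 10), rise), (max(p - 1, 0), 1 - rise)):
            # c := (c-1) max 0 with probability 1/2, else unchanged
            for c2 in (max(c - 1, 0), c):
                out[(v2, p2, c2)] += pv * pp * F(1, 2)
    return {state: pr for state, pr in out.items() if pr > 0}

def expected_sold(v, p, c):
    """Expected share value one month from state (v, p, c)."""
    return sum(pr * v2 for (v2, _, _), pr in month(v, p, c).items())

total = sum(month(3, 7, 8).values())
at_top = expected_sold(10, 5, 10)
at_bottom = expected_sold(0, 5, 10)
```

With p = 0.5 and c = 10 the expected value a month ahead is 19/2 from v = 10, where the cap bites, and 1/2 from v = 0, where the floor does.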
5.2 Playing the game 

Using the game interpretation, we can generate a probabilistic tree from the
transition system in Fig. 4 by duplicating nodes with multiple incoming arcs (as
in month), "unfolding" back-loops, and making minimising or maximising choices
as they are encountered. In this example, at each unfolding both the investor
I (making ⊔ choices) and the stock market M (making ⊓ choices whether to
impose a bar) need to choose between two ongoing branches, and their choice
could be different each time they revisit their respective decision points. Each of
I, M will be using a strategy.

For example (recalling Footnote 15), the investor I's strategy (maximising,
he hopes) for dealing with the falling cap might be

    wait until the share value v (rising) meets the cap c (falling),
    and reserve then.                                                    (3)

Waiting for v to rise is a good idea, but when it has met the cap c there is clearly
no point in waiting further.

And M's strategy (minimising, the investor fears) might be

    bar the investor, if possible, whenever the shares' probability p
    of rising exceeds 1/2.                                               (4)

In general, let \(\sigma_I\) and \(\sigma_M\) be sequences (possibly infinite) of
choices, like the above, that I and M might make. When they follow those sequences,
the game-tree they generate determines a probability distribution over valid game
paths [12]. Anticipating the next section, let \([\![\phi]\!]_V^{\sigma_I,\sigma_M}\)
denote that path distribution as generated by I and M's choices. We can now describe
I's actual payoff as an expected value

\[ \int_{[\![Game]\!]_V^{\sigma_I,\sigma_M}} \text{``the final state's } v\text{''} \qquad (5) \]

with the understanding that the random variable in the integral's body yields
zero if in fact there is no final state (because of an infinite path).

Footnote: The Sold function should have codomain [0, 1], but to avoid clutter we have
not scaled it down here by dividing by 10.

Footnote: The distribution is described explicitly by a definition and a lemma still to come.

In some cases, the choices made by I and M can be memoriless, in the sense
that in identical situations (identical values of v, p, c in this case, and the same
position in the transition system) they will always make the same choice. Both
(3) and (4) above are memoriless.

Memoriless strategies are particularly important for the efficient computation
of expected payoffs, and in Sec. 6 we show that they suffice for analysis of
qMμ formulae when the state space is finite: that is, informally, the players
gain no advantage by remembering where they have been.



5.3 The value of the game 

As is usual in game theory, when the actual strategies of the two players are
unknown, we define the value of the game to be the minimax over all strategy
sequences of the expected payoff; but it is well-defined only when it is the same
as the maximin, i.e. when in the notation of (5) we have

\[ \bigsqcup_{\sigma_I} \bigsqcap_{\sigma_M} \int_{[\![Game]\!]_V^{\sigma_I,\sigma_M}} \text{``the final state's } v\text{''}
\;=\; \bigsqcap_{\sigma_M} \bigsqcup_{\sigma_I} \int_{[\![Game]\!]_V^{\sigma_I,\sigma_M}} \text{``the final state's } v\text{''} . \]

The equivalence proved in our main theorem, to come, tells us that such games' values
are indeed well-defined, and that although we use the game interpretation to
write down the formulae, we can use the logical interpretation \(\|Game\|_V\) to reason
about their values. Sometimes, as in this simple case, we can use the logical
interpretation to calculate an approximation directly.



Although the details of month are (deliberately) slightly messy, the structure
of the overall formula Game has been chosen (also deliberately) to be fairly
simple, and as such the fixed point can be approximated by iterating the
function

\[ (\lambda X \cdot m.v \;\sqcup\; m.(X \sqcap m.X)) \]

beginning from the constant "bottom" function that is zero everywhere on the
state space.
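The iteration just described can be sketched directly. This is an illustrative reconstruction and not the authors' Mathematica or PRISM script: it assumes p is stored in tenths, reads the month transition as a sequential composition, keeps Sold in dollars rather than scaling it into [0, 1], and stops when successive iterates differ by less than a tolerance.

```python
from itertools import product

def month(v, p, c):
    """Successor states of (v, p, c) with probabilities; p is in tenths (assumption)."""
    succ = []
    for v2, pv in ((min(v + 1, c), p / 10), (max(v - 1, 0), 1 - p / 10)):
        rise = 2 / 3 if v2 < 5 else 1 / 3 if v2 > 5 else 1 / 2
        for p2, pp in ((min(p + 1, 10), rise), (max(p - 1, 0), 1 - rise)):
            for c2 in (max(c - 1, 0), c):
                succ.append(((v2, p2, c2), pv * pp * 0.5))
    return succ

STATES = list(product(range(11), repeat=3))
SUCC = {s: month(*s) for s in STATES}

def m(x):
    """Pre-expectation of expectation x over one month."""
    return {s: sum(pr * x[u] for u, pr in SUCC[s]) for s in STATES}

SOLD = m({s: float(s[0]) for s in STATES})        # {month}Sold, in dollars

def game_value(tol=1e-9, max_rounds=10_000):
    """Iterate X -> m.v max m.(X min m.X) from the zero function."""
    x = {s: 0.0 for s in STATES}
    for _ in range(max_rounds):
        mx = m(x)
        waited = m({s: min(x[s], mx[s]) for s in STATES})   # market may bar
        nxt = {s: max(SOLD[s], waited[s]) for s in STATES}  # investor chooses
        if max(abs(nxt[s] - x[s]) for s in STATES) < tol:
            return nxt
        x = nxt
    return x

value = game_value()
```

Reading off `value[(v, 5, 10)]` for v from 0 to 10 gives figures that can be compared with the tabulated optimal expected sale; any difference would point to a mismatch between this sketch's reading of month and the original model.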

Carrying out that calculation shows for example that if p is initially 0.5
and the cap c is 10, then the optimal expected sale-value for the investor is

initial share value:       0     1     2     3     4     5     6     7     8     9    10
optimal expected sale:   4.16  4.30  4.55  4.88  5.24  5.52  6.00  7.00  8.00  9.00  9.50     (6)



Even when the share value is only "moderately high," we thus see there is nothing 
to be gained by waiting, since the cap is likely to drop. For low initial values, 
however, some benefit can be gained by delaying the reservation for a while. 

Footnote: Adding e.g. "but reserve immediately if v has fallen for five months in a
row" to Strategy (3) would give it memory.

Footnote: It is in fact just a quantitative eventually in the temporal subset qTL of
qMμ [30, 29].

Footnote: We used Mathematica® for this example calculation, and the results were
verified by Gethin Norman, at Birmingham University (UK), using the PRISM model
checker [21] and MatLab®. The scripts are available online [41].

By comparison, the investor's "seat-of-the-pants" strategy at (3) gives a
significantly lower expected return against "worst-case" play by M:

initial share value:       0     1     2     3     4     5     6     7     8     9    10
Strategy (3)'s yield:    3.68  3.79  3.97  4.17  4.29  4.17  4.16  4.65  5.61  6.78  9.50

From this we might guess that when v is at least $6 (and p, c are as given) it is
better to "reserve now" (as (6) suggests) than to follow the seat-of-the-pants strategy and wait.



5.4 Winning the game 



Ideally we would like to be able to calculate both the value of the game and
the strategies to realise it, e.g. in this example we would like to be able to
offer "investment advice." Even better than knowing that the seat-of-the-pants
strategy can be improved, as we have just seen it can, is knowing how to improve it.

In some cases, the logical interpretation can help by providing theorems that
allow formulae to be simplified [30] or abstracted, thus bringing an apparently
difficult formula within the range of probabilistic model-checkers [37].

For formulae with a particularly simple structure, we might even be able to
appeal to theorems, proved in the logic, which give maximising or minimising
strategies directly. In the case of Game, we do have such a theorem;
paraphrased, it states in this case that the investor should

    make an immediate reservation just when the expected value of
    the stock in one month's time is at least as great as the expected        (7)
    value of the whole game played from this point.
    Otherwise, he should wait.

The expected value of the stock in one month's time is easily calculated: it
is just m.v.(v,p,c), where v,p,c are taken from the state "now." (Note that the
function v, as an argument of m, will instead take the v-value of the state in one
month's time.) Tabulated for p = 0.5 and c = 10 as at (6) above, that gives

share value:                          0     1     2     3     4     5     6     7     8     9    10
expected share value in one month: 0.50  1.00  2.00  3.00  4.00  5.00  6.00  7.00  8.00  9.00  9.50     (8)
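As a quick check on the row of expected values, the one-month expectation can be computed under a simplifying assumption that is ours and not taken from the definition of month: v rises by one with probability p and falls by one otherwise, clamped to the range 0..c. The function name `expected_next_value` is hypothetical. With p = 0.5 and c = 10 this assumption yields 0.50, 1.00, 2.00, ..., 9.00, 9.50 for v = 0, ..., 10, in agreement with the tabulated values.

```python
def expected_next_value(v, p, cap):
    """Expected share value after one month, under the assumed up/down-by-one update."""
    up = min(v + 1, cap)      # a rise cannot exceed the cap
    down = max(v - 1, 0)      # a fall cannot go below zero
    return p * up + (1 - p) * down


row = [expected_next_value(v, 0.5, 10) for v in range(11)]
for v, value in enumerate(row):
    print(f"v = {v:2d}: {value:.2f}")
```

Only the current p and c enter this one-step calculation, which is why the simplification is harmless here even though the full game also lets p and c change over time.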



Since the current values of v,p,c are known at the beginning of each month
(at the beginning of each turn, more generally), this maximising strategy can
be applied in practice provided the fixed-point can be approximated sufficiently
well. For our current game, comparing (6) and (8) confirms our guess above
about the problem with the seat-of-the-pants strategy: instead of its recommendation,
our initial move should be "make an immediate reservation if \(v \ge 6\), otherwise wait."
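The comparison can be made mechanical. The sketch below applies the rule "reserve just when the one-month expectation is at least the game value" to the two tabulated rows for p = 0.5 and c = 10, indexed by v = 0, ..., 10. The variable names and the rounding tolerance are our own choices; the tolerance is there only because the tabulated figures are rounded to two decimal places.

```python
# Tabulated rows for p = 0.5, c = 10, indexed by initial share value v = 0..10.
optimal_value = [4.16, 4.30, 4.55, 4.88, 5.24, 5.52, 6.00, 7.00, 8.00, 9.00, 9.50]
next_month = [0.50, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 9.50]

ROUNDING = 0.005  # half a unit in the last printed decimal place

advice = {
    v: "reserve" if next_month[v] >= optimal_value[v] - ROUNDING else "wait"
    for v in range(11)
}
threshold = min(v for v, move in advice.items() if move == "reserve")

for v in range(11):
    print(f"v = {v:2d}: {advice[v]}")
```

On these figures the advice switches from "wait" to "reserve" at v = 6, which is the initial move recommended in the text.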

In general, if we follow (7) consistently we will realise at least the optimum
(6) over sufficiently many trials.

5.5 Other games

Variations on Game can describe the value of other, related games.

It might be for example that a client's instructions are "get me the shares
when they're worth at least $6," and the investor's aim is to maximise his chance
of doing that. Let atLeast6 denote the characteristic function of those states
where \(v \ge 6\); then

\[ (\mu X \cdot \{month\}atLeast6 \sqcup \{month\}(X \sqcap \{month\}X)) \]

gives a lower bound for I's probability of achieving \(v \ge 6\) with an optimal
strategy. By analogy with (7), since the same theorem applies, that strategy
should be

    make an immediate reservation just when the probability of
    achieving \(v \ge 6\) next month is at least as great as the optimal.

Below we tabulate the probabilities, giving for contrast the results of the
strategy "reserve when \(v \ge 5\) and \(p \ge 0.5\)," i.e. the intuitive approach of waiting
until the chance of achieving \(v \ge 6\) next month is at least even:



                        probability of achieving \(v \ge 6\)

initial share value:               0     1     2     3     4     5     6     7     8     9    10
following optimal strategy:     0.25  0.29  0.34  0.41  0.46  0.50  0.56  1.00  1.00  1.00  1.00
following intuitive strategy:   0.25  0.28  0.33  0.37  0.42  0.50  0.50  1.00  1.00  1.00  1.00



We can see from the table that when v is $5 initially, the intuitive strategy 
is optimal: "reserve now." At $6, however, the optimal strategy — counter- 
intuitively — is to wait. 



5.6 More generally 

In the next section we show that the techniques used in this example are valid 
for all games — that is, that the value of any game of the form given in Sec. ^ is 
well-defined, that it can be realised by memoriless strategies if the state space is 
finite, and that its value corresponds exactly to the denotational interpretation 
of Sec.Ol 

For the current example, those results justified our using the denotational
interpretation to analyse Game, which in this simple case led to a direct calculation
(6) of the optimal result, and the formulation of an explicit strategy (7) to
achieve it. For more complex formulae, the optimal payoff can be determined indirectly
using model-checking methods derived from Markov Decision Processes
(MDP's).

For example, PRISM [21] is a probabilistic model checker which has support
for MDP's. It takes as input an occam-like description of a transition system,
including both overlapping-guard style (traditional, in CSP parlance

Recall Footnote m 

It also handles discrete- and continuous-time Markov chains. 



16 



AK McIver and CC Morgan



"internal") nondeterminism and (beyond occam / CSP) probabilistic choice constructs.
Using BDD-based techniques it translates the input description into an
MDP, called f say.

Normally, the tool allows the verification of MDP's against specifications
written in the temporal logic pCTL [13]: in this case an extended version was
used that supports reward-based specifications. The rewards are evaluated by
approximating the least fixed-point of \(f_{\sqcap}\) or \(f_{\sqcup}\) by repeated applications
beginning from bottom (zero), where \(f_{\sqcap}\) or \(f_{\sqcup}\) interprets (all) non-probabilistic
nondeterminism as minimising, maximising respectively (i.e. "uni-modally"),
and the interpretation of the result as a measure of the minimum or maximum
possible reward in a probabilistic/demonic or probabilistic/angelic game
is justified by Everett's original work [10].

To deal with the "bi-modal," minimax non-determinism of our example, the
PRISM-produced transition matrices for the MDP were exported, and used as
data for a MatLab® program that performed the angelic/demonic calculations
explicitly; the results agreed with the calculations we had previously obtained
from Mathematica® by coding up the qMμ formula directly [41].

The justification in this more general case that the value can be interpreted
as the minimax expected reward of the original game is provided by our Thm. 1
below, extending Everett.

6 Proof of equivalence 

In this section we give our main result, the equivalence of the operational,
"Stirling-game" and the denotational, "Kozen-logic" interpretations of qMμ formulae.
We formalise strategies in both cases, whether they can or cannot have
"memory" of where the game or transition system has gone so far, and the effect
of "minimaxing" over them.

To begin with, we fix a single pair of strategies: one maximising, one minimising.

6.1 Fixed strategies for the Stirling interpretation 

Our first step will be to explain how the games of Fig. O can be formalised 
provided a fixed pair of players' strategies is decided beforehand. 

The current position of a game, as we saw earlier, is a formula/state
pair. We introduce two strategy functions called \(\underline{\sigma}\) and \(\overline{\sigma}\), which will prescribe
in advance the players' decisions to be taken as they go along: the functions are
of type "finite-game-path to Boolean," and the player Min (resp. Max), instead
of deciding "on the fly" how to interpret a decision point \(\sqcap\) (resp. \(\sqcup\)), takes the
strategy function \(\underline{\sigma}\) (resp. \(\overline{\sigma}\)) and applies that to the sequence of game positions
traversed so far. The result "true" means "take the left subformula," say.

These strategies model full memory, because each is given as an argument
the complete history of the game up to its point of use. (Note that the history
includes the current state s.) We stipulate however that strategies are colour-insensitive,
since colours are not an artefact of the system itself: that is, we
assume that from any colour C it is possible to recover the identity of the variable
X for which it was generated, and any strategy treats game position (C, s) in a
history as if it were (X, s).
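The type "finite-game-path to Boolean" can be made concrete with a small sketch. Everything named here is hypothetical and chosen for illustration: a position is modelled as a pair of a formula label and a share value v, a history is assumed to record one position per month, and the threshold strategy is only an example. The full-memory variant mirrors the remark that adding "reserve immediately if v has fallen for five months in a row" gives a strategy memory.

```python
from typing import Callable, Sequence, Tuple

Position = Tuple[str, int]              # hypothetical: (formula label, share value v)
History = Sequence[Position]            # finite game path, current position last
Strategy = Callable[[History], bool]    # True means "take the left subformula"


def memoriless(decide: Callable[[Position], bool]) -> Strategy:
    """Lift a decision on the current position alone to a strategy on histories."""
    return lambda history: decide(history[-1])


# Memoriless example: take the left branch exactly when the current value is at least 6.
reserve_at_six: Strategy = memoriless(lambda position: position[1] >= 6)


def reserve_after_five_falls(history: History) -> bool:
    """Full-memory variant: also take the left branch after five falls in a row."""
    values = [v for _, v in history]
    last_six = values[-6:]
    fell_five_times = len(last_six) == 6 and all(
        later < earlier for earlier, later in zip(last_six, last_six[1:])
    )
    return fell_five_times or reserve_at_six(history)
```

A memoriless strategy inspects only `history[-1]`, whereas the second function genuinely needs earlier positions; both have the same type, which is what allows the later results to compare them.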

We can now formalise our probabilistic extension of Stirling's game. Rather
than see it, as in our earlier game figure, as a linear sequence of moves interleaving
maximising, minimising and probabilistic choices, we use our strategy functions to
present the game in two separated stages.

In the first stage we construct a (possibly infinite) purely probabilistic game-tree,
using the given formula \(\phi\), the initial state s and the pre-packaged strategy
functions \(\underline{\sigma}, \overline{\sigma}\). The process is shown in Fig. 5 and clearly is derived from
the game figure given earlier: the difference is that in Fig. 5 the probabilistic
choices are "deferred" by our showing the whole tree of their possibilities,
whereas in the earlier figure they are "taken as they come." We write
\([\![\phi]\!]_{V}^{\underline{\sigma},\overline{\sigma}}.s\) for the tree generated by the process of Fig. 5.

For the second stage we play the purely-probabilistic game represented by
the tree just generated, and use the function Val of Def. 1 from valid game
paths to the non-negative reals, to determine the "payoff" as described
earlier. Our "expected payoff" from the whole process is then the expected
value of this payoff function over the distribution of paths determined by that
game tree, formalised as follows. (Abusing notation, we write
\([\![\phi]\!]_{V}^{\underline{\sigma},\overline{\sigma}}.s\) for the
probability distribution of paths determined by the tree, as well as for the tree
itself.)

Definition 2. Value of fixed-strategy Stirling game. The value of a game
played from formula \(\phi\) and initial state s, with fixed strategies \(\underline{\sigma}, \overline{\sigma}\), is given by
the expected value

\[ \int_{[\![\phi]\!]_{V}^{\underline{\sigma},\overline{\sigma}}.s} Val \]

of Val over the (probability distribution determined by the) game-tree
\([\![\phi]\!]_{V}^{\underline{\sigma},\overline{\sigma}}.s\)
generated by the formula, the strategies and the initial state as shown in Fig. 5.
(The argument that this is well defined is given in Lem. 3 following.)

Lemma 3. Well-definedness of Def. 2. The expected value of Val over game
trees is well-defined.

Proof. We must show (1) that \([\![\phi]\!]_{V}^{\underline{\sigma},\overline{\sigma}}.s\) generated as at Fig. 5 determines a sigma-algebra,
and (2) that Val is measurable over it. We use in several places that the

That is, since colours do not occur in the physical systems we are specifying, we are
not obliged to model strategies that take them into account.
Thus strategies do not depend on the actual colour value that was arbitrarily chosen
during a fixed-point step. All that matters is whether colours are the same or differ,
and which kind of fixed point (least or greatest) generated them.

As in the earlier game figure, the current game position is some \((\phi, s)\); but here we appeal to
pre-determined strategy functions \(\underline{\sigma}, \overline{\sigma}\), and use a "current path" variable \(\pi\), to construct
the whole probabilistic tree of possibilities rather than to play along one of its branches
as we go.

After each step, the path \(\pi\) is extended with \((\phi, s)\); it is initially empty. The formula
and state change as in the earlier game figure, but according to the given strategies if appropriate.

1. If \(\phi\) is a free variable X, make a single probability-one edge leading to tip \((V.X.\pi.s)\).

2. If \(\phi\) is A then make a single probability-one edge leading to tip \((V.A.s)\).

3. If \(\phi\) is \(\{k\}\Phi\) then make one edge for each state s' having V.k.s.s' non-zero, labelling
   it with that probability, plus one more "payoff" edge if those probabilities sum to
   less than one. For each s' edge, add a child \((\Phi, s')\); if there is a payoff edge then
   add a child (y) where y is the payoff \(V.k.s.\$ / (1 - \sum_{s' \in S} V.k.s.s')\). *

4. If \(\phi\) is \(\Phi' \sqcap \Phi''\) (resp. \(\Phi' \sqcup \Phi''\)) then choose between \(\Phi'\) and \(\Phi''\) depending on
   \(\underline{\sigma}.\pi\) (resp. \(\overline{\sigma}.\pi\)): form a single edge of probability one to the next game position
   \((\Phi, s)\), where \(\Phi\) is the chosen 'junct \(\Phi'\) or \(\Phi''\).

5. If \(\phi\) is \(\Phi' \triangleleft G \triangleright \Phi''\), choose between \(\Phi'\) and \(\Phi''\) depending on V.G.s: form a single
   edge of probability one to the next game position \((\Phi, s)\), where \(\Phi\) is the chosen
   'junct.

6. If \(\phi\) is \((\mu X \cdot \Phi)\) then choose fresh colour C; make a single probability-one edge
   leading to (C, s).

7. If \(\phi\) is \((\nu X \cdot \Phi)\) then (as for \(\mu\)) choose fresh colour C; make a single probability-one
   edge leading to (C, s).

8. If \(\phi\) is colour C, extract the game position \(((\mu/\nu X \cdot \Phi), s')\); make a single
   probability-one edge leading to \((\Phi[X \mapsto C], s)\).

The appendix explains how the operations of choosing and binding colours are
formalised in this denotational definition.

We write \([\![\phi]\!]_{V}^{\underline{\sigma},\overline{\sigma}}.s\) for the tree generated, as above, from formula \(\phi\),
strategies \(\underline{\sigma}, \overline{\sigma}\) and initial state s.

Fig. 5. Tree-building process, with paths and strategies.

* If the probabilities over S sum to one then we do not add a payoff edge, so the
question of division by zero does not arise.

tree is finitely branching, and that therefore it has only countably many nodes
(and hence only countably many finite paths).

For (1) we appeal to the standard construction of path distributions from 
trees: the basis elements are "cones" of paths all having a common (finite) prefix; 
and the measure of a cone is the product of the probabilities found on the path 
leading from the root to the end of the common prefix ( equivalently, to the base of 
the cone). The algebra is generated by closing the basis under countable unions 
and complement (and hence countable intersections also). 
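The measure of a cone is easy to compute for a concrete tree. The sketch below uses a small, entirely hypothetical finitely-branching tree (the node names and probabilities are ours) and exact rational arithmetic: the measure of the cone with a given finite prefix is the product of the edge probabilities along that prefix, and the cones of all prefixes of a fixed depth partition the path space.

```python
from fractions import Fraction

# Hypothetical finitely-branching tree: node -> list of (edge probability, child).
TREE = {
    "root": [(Fraction(1, 2), "a"), (Fraction(1, 2), "b")],
    "a": [(Fraction(1, 3), "a1"), (Fraction(2, 3), "a2")],
    "b": [(Fraction(1), "b1")],
}


def cone_measure(prefix):
    """Measure of the cone of all paths extending `prefix` (a node list from the root)."""
    measure = Fraction(1)
    for parent, child in zip(prefix, prefix[1:]):
        edge_probability = {c: p for p, c in TREE[parent]}
        measure *= edge_probability[child]
    return measure


def prefixes(depth, node="root"):
    """All prefixes of the given depth starting at `node` (shorter if a tip is reached)."""
    if depth == 0 or node not in TREE:
        return [[node]]
    return [[node] + rest
            for _, child in TREE[node]
            for rest in prefixes(depth - 1, child)]


total = sum(cone_measure(p) for p in prefixes(2))
```

Here `cone_measure(["root", "a", "a2"])` is 1/2 times 2/3, and the depth-2 cones sum to one because no edge probability is lost to a payoff edge in this example.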

For (2) we must show that for any real r the inverse image \(Val^{-1}.(r,\infty)\) is
in the algebra defined at (1), where \((r,\infty)\) is the open interval of reals above r.

We begin with the case \(0 \le r < 1\), in which case the inverse image is the set
of all paths containing an infinite number of \(\nu\)-colours plus all the (finite) paths
ending in an explicit s-tip with r < s. But since there are only countably many
finite paths in total, there are certainly only countably many (s)-tipped paths,
which we can therefore ignore.

Since each new colour (of either kind) is generated at some node of the tree,
there are only countably many \(\nu\)-colours, and so we may concentrate on a single
\(\nu\)-colour C.

For any \(i > 0\) the set \(C_i\) of paths with at least i occurrences of C is measurable,
since it is the union of all cones determined by finite prefixes ending in an i-th
occurrence of C exactly. Then the set \(C_\infty\) of paths containing infinitely many C's
is just the countable (over i) intersection of all the \(C_i\)'s.

We finish by noting that for the case \(1 \le r\) the inverse image is empty; and
for the case \(r < 0\) it is just the set of all paths.

Although the game is played "all at once" in the earlier game figure, note that the strategy
functions and the construction of Fig. 5 make it appear as if it is played in two
stages: first, we determine the strategies; second, we roll the dice. The point of
that is to allow us to use standard techniques of expected values in the second,
purely probabilistic stage, free of the complications of max/min-nondeterminism.
The strategy functions' generality makes the two views equivalent.

6.2 Fixed strategies for the Kozen interpretation 

Now that the value of a fixed-strategy game is defined, our second step is to
define fixed-strategy denotations: we augment the denotational semantics given earlier with the
same strategy functions as above. For clarity we use slightly different brackets
\(|||\phi|||_{V}^{\underline{\sigma},\overline{\sigma}}\) for the extended semantics.

The necessary alterations to the rules of that semantics are straightforward, the principal
one being that in the case of \(\sqcap\) or \(\sqcup\), instead of taking a minimum or maximum, we
use the argument \(\underline{\sigma}\) or \(\overline{\sigma}\) as appropriate to determine whether to carry on with
\(\phi'\) or with \(\phi''\).

A technical complication is then that all the definitions have to be changed
so that the "game sequence so far" is available to \(\underline{\sigma}\) and \(\overline{\sigma}\) when required. That
can be arranged for example by introducing an extra "path-so-far" argument
and passing it, suitably extended, on every right-hand side.

The modified rules are given in full in the appendix.

6.3 Equivalence of the interpretations 

We now have our first equivalence, for fixed strategies:

Lemma 4. Equivalence of fixed-strategy games and logic. For all closed
qMμ formulae \(\phi\), valuations V, states s and strategies \(\underline{\sigma}, \overline{\sigma}\), we have

\[ \int_{[\![\phi]\!]_{V}^{\underline{\sigma},\overline{\sigma}}.s} Val \;=\; |||\phi|||_{V}^{\underline{\sigma},\overline{\sigma}}.s \; . \]

Proof. (sketch) The proof is by structural induction over \(\phi\), straightforward
except when least- or greatest fixed-points generate infinite trees. In those cases
we consider approximations \(Val_n\) to the valuation function Val such that \(Val_n.\pi\) acts
as Val if path \(\pi\) contains fewer than n occurrences of colour C, otherwise returning
zero (resp. one) for the \(\mu\) (resp. \(\nu\)) case. Those n-approximants in
the game interpretation are shown by mathematical induction to correspond to
the usual n-fold iterates that approximate fixed points in the denotational interpretation;
and bounded monotone convergence is used to distribute suprema
(for least fixed-points) through \(\int\).

For similar distribution of the infima required by greatest fixed-points, we
subtract from one and again argue over suprema.

Lem. 4 will be the key to our completing the argument, in Sec. 6.4 to follow,
that the value of the Stirling game is well-defined when we take the minimax
over all strategies of the expected payoff, rather than just considering a fixed
pair. That is, in the notation of this section we must establish

\[ \sqcap_{\underline{\sigma}} \sqcup_{\overline{\sigma}} \int_{[\![\phi]\!]_{V}^{\underline{\sigma},\overline{\sigma}}.s} Val \;=\; \sqcup_{\overline{\sigma}} \sqcap_{\underline{\sigma}} \int_{[\![\phi]\!]_{V}^{\underline{\sigma},\overline{\sigma}}.s} Val \; . \qquad (9) \]

The utility of Lem. 4 is that it allows us to carry out the argument in a denotational
rather than operational context: we can avoid the integrals, games and
trees and simply use \(|||\cdot|||\) and cpo's instead.

In fact we show (9) to be even simpler: both sides are equal to the original
denotational interpretation, with its \(\sqcap\) and \(\sqcup\) operators still in the formula and
therefore no need for strategy functions at all. That is, we prove (9) by appealing
to Lem. 4 to move from \(\int\)'s to \(|||\cdot|||\)'s, and then we will establish the equality

\[ \sqcap_{\underline{\sigma}} \sqcup_{\overline{\sigma}} |||\phi|||_{V}^{\underline{\sigma},\overline{\sigma}} \;=\; \|\phi\|_{V} \;=\; \sqcup_{\overline{\sigma}} \sqcap_{\underline{\sigma}} |||\phi|||_{V}^{\underline{\sigma},\overline{\sigma}} \; . \]

And so we will have that the game is indeed well-defined, and that \(\|\phi\|_V\) is its
value.

A full proof is given in the appendix.

6.4 Full equivalence via memoriless strategies
for finite state spaces

In the previous section we handled maximising/minimising strategies by modelling
them explicitly as a fixed pair of "decision" functions chosen beforehand.
Here we show that the order in which they are chosen makes no difference:
whether max-before-min or the reverse, the Stirling-value of the game equals
the Kozen-value of the original formula, i.e. with the \(\sqcup/\sqcap\) operators still in place
and no explicit strategy functions.

A key step in that process is showing that, over a finite state space, there are 
fixed "memoriless" strategies that "solve" a Kozen interpretation in the sense of 
achieving its value by local decisions that depend only on the current state and 
not on the history; such strategies are implemented by Boolean conditionals. 

Our approach is a generalisation of an argument used by Everett, who treated
formulae with a single least fixed-point [10]: we have generalised it to deal with
multiple fixed points nested arbitrarily.

Let formula \(\phi_{\underline{\mathsf{G}}}\) be derived from \(\phi\) by the syntactic operation of replacing each
operator \(\sqcap\) in \(\phi\) by a specific predicate symbol drawn from a tuple \(\underline{\mathsf{G}}\) of our choice,
possibly a different symbol for each syntactic occurrence of \(\sqcap\). This represents
replacing the general minimising strategy \(\sqcap\) by some specific memoriless strategy
(-ies) \(\underline{G}\) that \(\underline{\mathsf{G}}\) denotes.

Similarly we write \(\phi_{\overline{\mathsf{G}}}\) for the derived formula in which all instances of \(\sqcup\) are
replaced left-to-right by successive predicate symbols in a tuple \(\overline{\mathsf{G}}\).

With those conventions, we will appeal to a lemma of the appendix stating that for all qMμ
formulae \(\phi\) over a finite state space S, and valuations V, there exist (semantic)
predicate tuples \(\underline{G}\) and \(\overline{G}\) corresponding to the predicate symbols as above such
that \(\|\phi_{\underline{\mathsf{G}}}\|_{V'} = \|\phi\|_{V} = \|\phi_{\overline{\mathsf{G}}}\|_{V'}\), where V' is the technical extension of V that maps
the new symbols \(\underline{\mathsf{G}}, \overline{\mathsf{G}}\) to \(\underline{G}, \overline{G}\) respectively, and leaves all else unchanged.

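For example, if the formula \(\phi\) is

\[ (\mu X \cdot A_1 \sqcup (\nu Y \cdot A_2 \sqcap \{k\}(A_3 \sqcup (A \triangleleft \mathsf{G} \triangleright Y)))) , \]

then we are saying we can find predicate tuples \((\underline{G}_1)\) and \((\overline{G}_1, \overline{G}_2)\) so that for
corresponding predicate-symbol tuples \(\underline{\mathsf{G}} = (\underline{\mathsf{G}}_1)\) and \(\overline{\mathsf{G}} = (\overline{\mathsf{G}}_1, \overline{\mathsf{G}}_2)\) we can define

\[ \phi_{\underline{\mathsf{G}}} \;=\; (\mu X \cdot A_1 \sqcup (\nu Y \cdot A_2 \triangleleft \underline{\mathsf{G}}_1 \triangleright \{k\}(A_3 \sqcup (A \triangleleft \mathsf{G} \triangleright Y)))) \]

and

\[ \phi_{\overline{\mathsf{G}}} \;=\; (\mu X \cdot A_1 \triangleleft \overline{\mathsf{G}}_1 \triangleright (\nu Y \cdot A_2 \sqcap \{k\}(A_3 \triangleleft \overline{\mathsf{G}}_2 \triangleright (A \triangleleft \mathsf{G} \triangleright Y)))) , \]

and then extend V to a V' that takes \(\underline{\mathsf{G}}_1, \overline{\mathsf{G}}_1, \overline{\mathsf{G}}_2\) to \(\underline{G}_1, \overline{G}_1, \overline{G}_2\) respectively,
so that \(\phi_{\underline{\mathsf{G}}}\), \(\phi_{\overline{\mathsf{G}}}\) and \(\phi\) itself are all \(\|\cdot\|_{V'}\)-equivalent.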

Finiteness is needed in Case G of the lemma's proof. 
^'^ It is easy to show also that all three formulae are then || ■ ||v' -equivalent to ^ , but 
we do not need that.

The proof of that lemma is by induction, intricate only in one case, which is where
we rely on Everett's techniques [op. cit.]. That part of the proof, together with
several preliminary lemmas, is given in the appendices.

With it we show, first, that the Kozen interpretation is insensitive to the
order in which the strategies are chosen; then, from Lem. 4 we have immediately
that Stirling games are similarly insensitive, and thus our main result, that
the value of the game is the value of the denotation.

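Lemma 5. Minimax equals maximin for Kozen interpretation.
For all qMμ formulae \(\phi\), valuations V and strategies \(\underline{\sigma}, \overline{\sigma}\), we have

\[ \sqcap_{\underline{\sigma}} \sqcup_{\overline{\sigma}} |||\phi|||_{V}^{\underline{\sigma},\overline{\sigma}} \;=\; \sqcup_{\overline{\sigma}} \sqcap_{\underline{\sigma}} |||\phi|||_{V}^{\underline{\sigma},\overline{\sigma}} \; . \qquad (10) \]

Proof. From monotonicity, we need only prove lhs \(\le\) rhs. Note that from
the appendix lemma on memoriless strategies we have predicates \(\underline{G}\) and \(\overline{G}\) satisfying

\[ \|\phi_{\underline{\mathsf{G}}}\|_{V'} \;=\; \|\phi\|_{V} \;=\; \|\phi_{\overline{\mathsf{G}}}\|_{V'} \; , \qquad (11) \]

with V' extending V as we have said, a fact which we use further below.

To begin with, using the predicates \(\underline{G}\) from (11), we start from the lhs of (10)
and observe that

\[ \sqcap_{\underline{\sigma}} \sqcup_{\overline{\sigma}} |||\phi|||_{V}^{\underline{\sigma},\overline{\sigma}} \;=\; \sqcap_{\underline{\sigma}} \sqcup_{\overline{\sigma}} |||\phi|||_{V'}^{\underline{\sigma},\overline{\sigma}} \;\le\; \sqcup_{\overline{\sigma}} |||\phi_{\underline{\mathsf{G}}}|||_{V'}^{\overline{\sigma}} \; , \qquad (12) \]

in which on the right we omit the now-ignored \(\underline{\sigma}\) argument, because (on the
left) formula \(\phi\) does not refer to the extra symbols in V' and (on the right) the
\(\sqcap_{\underline{\sigma}}\) can select exactly those predicates \(\underline{G}\) referred to in V' by \(\underline{\mathsf{G}}\) simply by making
an appropriate choice of \(\underline{\sigma}\).

We then eliminate the explicit strategies altogether by observing that

\[ \sqcup_{\overline{\sigma}} |||\phi_{\underline{\mathsf{G}}}|||_{V'}^{\overline{\sigma}} \;\le\; \|\phi_{\underline{\mathsf{G}}}\|_{V'} \; , \qquad (13) \]

because the simpler \(\|\cdot\|\)-style semantics on the right interprets \(\sqcup\) as maximum,
which cannot be less than the result of appealing to some strategy function \(\overline{\sigma}\).
We can now continue on our way towards the rhs of (10) as follows:

\[ \begin{array}{rll}
     & \|\phi_{\underline{\mathsf{G}}}\|_{V'} & \text{carrying on from (13)} \\
   = & \|\phi\|_{V} & \text{first equality at (11)} \\
   = & \|\phi_{\overline{\mathsf{G}}}\|_{V'} & \text{second equality at (11)} \\
 \le & \sqcap_{\underline{\sigma}} |||\phi_{\overline{\mathsf{G}}}|||_{V'}^{\underline{\sigma}} & \text{as for (13) above, backwards and with inequality reversed} \\
 \le & \sqcup_{\overline{\sigma}} \sqcap_{\underline{\sigma}} |||\phi|||_{V}^{\underline{\sigma},\overline{\sigma}} \; , & \text{as for (12) above, backwards and with inequality reversed}
\end{array} \]

and we are done. (Note that in the last step we were again able to use the fact
that \(\phi\) is insensitive to the difference between the extended valuation V' and the
original valuation V.)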



Unfortunately Everett's work as it stands is less than we need, so although we borrow
his techniques we cannot simply appeal to his result as a whole.

2« Trivially n^u^|||<^l|^'^ = |||0|||5'^ > n^|||,^|||5-^ .

The proof above establishes the duality we seek between the two interpretations,
and our principal result:

Theorem 1. Value of Stirling game. The value of a Stirling game is well-defined,
and equals \(\|\phi\|_V\).

Proof. Lem. 4 and Lem. 5 establish the equality (9), for well-definedness; the
stated equality with \(\|\phi\|_V\) occurs during the proof of the latter.

Finally, we have an even tighter result about the players' strategies:

Lemma 6. Memoriless strategies. There exists a memoriless strategy \(\overline{G}\)
which, if followed by player Max, achieves the value of the Stirling game against
all strategies of player Min. (A similar result holds for player Min.)

Proof. Directly from the appendix lemma on memoriless strategies and Thm. 1.

7 Conclusion

Von Neumann and Morgenstern [46] proved the minimax theorem for zero-sum
two-player games comprising one kind of play in a single game. Everett [10]
extended this to "least-fixed point" games, i.e. an unbounded number of plays
of a finite number of possibly different games that can recursively call each other
within a single "loop." For the special case where those games are turn-based,
we have extended that result further to include both least- and greatest fixed
points, and arbitrary nesting.

Our reason for doing this was to introduce a novel game-based interpretation
for the quantitative μ-calculus qMμ over probabilistic/angelic/demonic transition
systems, probabilistically generalising Stirling's game interpretation of the
standard μ-calculus; we aimed to show it equivalent to our existing Kozen-style
interpretation of qMμ, and so to provide an "operational" semantics.

The equivalent interpretations are general enough to specify cost-based properties
of probabilistic systems, and many such properties lie outside standard
temporal logic. The Stirling-style interpretation is close to automata-based approaches,
whilst the Kozen-style logic (studied more extensively elsewhere [30])
provides an attractive proof system.

Part of our generalisation has been to introduce the Everett-style "payoff
states" $ into Stirling's generalised games. Although many presentations of probabilistic
transitions (including our earlier work) do not include the extra state,
giving instead simply functions from S to S which in effect take the primitive elements
of formulae to be probabilistic programs, here our primitive elements are
small probabilistic games. The probabilistic programs are just the simpler
special case of payoff zero. The full proof of the appendix lemma on memoriless
strategies makes that necessary, since we treat the G/ν case via a duality,
appealing to the G/μ case. But it is a
duality under which probabilistic programs are not closed, whereas the slightly
more general probabilistic games are closed. Thus we have had to prove a slightly
more general result.

An interesting possibility for further work is the use of intermediate fixed
points, yielding say a value \(0 \le x \le 1\) rather than the fixed zero-for-least and
one-for-greatest that are traditional. For expectation transformer t we would
propose the definition

\[ fix_x.t \;=\; \lim_{n \to \infty} t^n.\overline{x} \; , \qquad (14) \]

where \(\overline{x}.s = x\), so that (for continuous t at least) \(\mu.t\), \(\nu.t\) become the special cases
of zero and one for x, i.e. \(fix_0.t\) and \(fix_1.t\). When t is purely probabilistic (thus
almost linear), it can be shown that (14) is meaningful (i.e. converges) for any
\(0 \le x \le 1\), and agrees with \(\mu\), \(\nu\) where it should.

We do not know however whether convergence is guaranteed when t may 
contain angelic or demonic nondeterminism. 
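The role of the constant starting expectation in definition (14) can be seen on the two-state example given in the footnote: S = {0, 1}, t is the transformer for s := 1 − s, and e.s = s. The sketch below, in which the function names `flip` and `iterate` are ours, represents expectations as Python functions on states and applies t repeatedly; starting from e the iterates oscillate as (s + n) mod 2, while starting from a constant they do not move.

```python
def flip(e):
    """Transformer for s := 1 - s on S = {0, 1}: (t.e).s = e.(1 - s)."""
    return lambda s: e(1 - s)


def iterate(t, e, n):
    """Return t applied n times to the expectation e."""
    for _ in range(n):
        e = t(e)
    return e


identity = lambda s: s          # the state-dependent expectation e.s = s
constant = lambda s: 0.3        # a constant expectation, as required by (14)

oscillating = [iterate(flip, identity, n)(0) for n in range(6)]
steady = [iterate(flip, constant, n)(0) for n in range(6)]
```

The list `oscillating` alternates between 0 and 1 and so has no limit, whereas `steady` is constant; this is only the deterministic special case and says nothing about transformers containing angelic or demonic choice.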

The utility of \(fix_x\) is when infinite behaviour is to attract a reward which is
neither zero nor one. In the game interpretation we would collapse the infinite-path
cases of the definition of Val to the single

    The path \(\pi\) is infinite and there is a colour C appearing infinitely often that
    was generated by \(fix_x\) for some x; in this case \(Val.\pi\) is x.

In the logical interpretation we would use (14) just above.

For the investor of Sec. 5 it might be that his reservation costs some fixed $x,
so that infinite behaviour (never reserving) is awarded $x rather than zero (i.e.
he keeps his money). A more advanced use would be that he seeks to maximise
his profit, defined to be the difference \(v_1 - v_0\), where \(v_0\) is the market value
v when he reserves, and \(v_1\) is its value one month later (when the shares are
delivered, and he can sell). Because \(v_1 - v_0\) could be negative, we would shift-and-scale
to transform the expectations into the range [0, 1], with the effect that
the zero awarded for "never reserves" would be transformed to 0.5.

8 Related work 

Probabilistic temporal logics, interpreted over nondeterministic/probabilistic transition
systems, have been studied extensively, most notably by de Alfaro,
Jonsson, Segala and Vardi. Condon considered the complexity
of underlying transition systems like ours, including probabilistic- (but \({}_{1/2}\oplus\)
only), demonic- and angelic choice, but without our more general expectations
and payoffs. Monniaux [26] uses Kozen's deterministic formulation together with
demonic program inputs to analyse systems via abstract interpretation.

The pCTL of Aziz and of Hansson and Jonsson provides a threshold
operator which allows properties such as "\(\phi\) is eventually satisfied with probability
at least 0.75," where the underlying distribution is over execution paths.

Using a constant expectation \(\overline{x}\) is necessary, as \(\lim_{n \to \infty} t^n.e\) does not converge in
general if expectation e may vary over the state.

For example let S = {0, 1} and take t to be (the transformer corresponding to)
s := 1−s for \(s \in S\), with e.s = s; then \(t^n.e.s = (s + n) \bmod 2\).
Recall Footnote IT^

Similarly Narasimha et al. use probability thresholds, and restrict to the
alternation-free fragment of the μ-calculus; for that fragment they do provide an
operational interpretation which selects the proportion of paths that satisfy the
given formula. Their transition systems are deterministic.

Though the quantitative μ-calculus has received much less attention, its use
of expected values allows a greater variety of expression; in particular, it can
specify properties that are inherently cost-based.

Huth and Kwiatkowska for example use real-valued expressions based
on expectations, and they have investigated model-checking approaches to evaluating
them; but they do not provide an operational interpretation of the logic,
nor have they exploited its algebraic properties [30].

De Alfaro and Majumdar use qMμ to address an issue similar to, but not
the same as ours: in the more general context of concurrent games, they show
that for every LTL formula \(\psi\) one can construct a qMμ formula \(\phi\) such that
\(\|\phi\|_V\) is the greatest assured probability that Player 1 can force the game path
to satisfy \(\psi\).

The difference between de Alfaro's approach and ours can be seen by considering
the formula \(\Psi = (\mu X \cdot \{k\}atB \sqcup \{k\}X)\) over the transition system

\[ V.k \;=\; (s := A \;\, {}_{1/2}\oplus \;\, s := B) \ \text{if}\ (s = A) \ \text{else}\ (s := A) \]

operating on state space {A, B}. (In fact formula \(\Psi\) expresses the notorious
AFAXatB in the temporal subset of qMμ, where V.atB.s is defined to
be 1 if (s = B) else 0.) Player 1 can force satisfaction of \(\Psi\) with probability one in
this game, since the only path for which it fails (all A's) occurs with probability
zero; so de Alfaro's construction yields a different formula \(\phi\) such that \(\|\phi\|_V = 1\).

Yet \(\|\Psi\|_V\) for the original formula is only 1/2, which is the value of the Stirling
game played in this system. It is "at each step, seek to maximise (\(\sqcup\)) the payoff,
depending on whether after the following step ({k}) you will accept atB and
terminate, or go around again (X)." Note that the decision "whether to repeat
after the next step" is made before that step is taken. (Deciding after the step
would be described by the formula \((\mu X \cdot \{k\}(atB \sqcup X))\).) The optimal strategy
for Max is of course given by \((\mu X \cdot \{k\}atB \ \text{if}\ atA \ \text{else}\ \{k\}X)\).
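The three formulae just discussed can be evaluated numerically on the two-state system by iterating each from bottom. The sketch below is a hedged illustration only: the encoding of states as list indices and all function names are ours. It computes the least fixed point of the original formula, of the "decide after the step" variant, and of the version in which the maximising choice is replaced by the memoriless conditional "if atA".

```python
A, B = 0, 1                 # states encoded as list indices (our choice)
AT_B = [0.0, 1.0]           # V.atB: 1 at B, 0 at A


def k(X):
    """{k}X: from A go to A or B with probability 1/2 each; from B go to A."""
    return [0.5 * X[A] + 0.5 * X[B], X[A]]


def join(X, Y):
    """Pointwise maximum, the interpretation of the maximising choice."""
    return [max(x, y) for x, y in zip(X, Y)]


def decide_before_step(X):      # {k}atB join {k}X
    return join(k(AT_B), k(X))


def decide_after_step(X):       # {k}(atB join X)
    return k(join(AT_B, X))


def memoriless_choice(X):       # {k}atB if atA else {k}X
    return [k(AT_B)[A], k(X)[B]]


def least_fixed_point(step, tol=1e-12, max_iter=10_000):
    """Iterate `step` from the everywhere-zero expectation until it stabilises."""
    X = [0.0, 0.0]
    for _ in range(max_iter):
        nxt = step(X)
        if max(abs(a - b) for a, b in zip(nxt, X)) < tol:
            return nxt
        X = nxt
    raise RuntimeError("no convergence within max_iter iterations")


before = least_fixed_point(decide_before_step)
after = least_fixed_point(decide_after_step)
fixed_strategy = least_fixed_point(memoriless_choice)
```

Under this encoding the first and third computations agree on the value 1/2 in both states, consistent with the memoriless strategy being optimal, while the "decide after" variant approaches 1.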

Finally, our result Lem. 6 for memoriless strategies holds for all qMμ formulae,
whereas (we believe) de Alfaro et al. treat only a subset, those formulae encoding
the automata used in their construction.

More recently, de Alfaro has given theorems for equivalence of game- and denotational
interpretations of quantitative μ-calculus formulae for "discounted"
two-player games, provided the formulae are "strongly deterministic". Strongly
deterministic is a syntactic criterion that restricts to formulae that avoid the difference
we illustrate above: that is, their game-value, as we define it, and their
"proportion of paths LTL-satisfying" value (as above) are in agreement. Discounted
(turn-based) games, in our terms, are a special case of our Everett-style

The restriction also excludes for example the case study of Sec.|Kl where our interest 
is genuinely in a game's minimax value, rather than in the probability of satisfying 
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payoff states in which the probability of transition to $ is the complement 1 — a 
of the discount factor a, as illustrated in Fig.^ 
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Jjem.^ Logic/game equivalence of fixed- strategy interpretations states that 

For all closed qMfi formulae 0, valuations V, states s and strategies ct, ct, 
we have 



where Val is given by Def . ^ the tree-building semantic function |-] is as 
given in Fig. |5l and the strategy-extended denotational semantics ||| • ||| is 
given at Fig. below. 

Proof. We use structural induction over a stronger hypothesis including explicit 
paths (at ((T?)|l below), straightforward except when least- or greatest fixed-points 
generate infinite trees; in each case the current formula will be 0, and its con- 
stituent formula(e) will be (with primes if necessary). During the proof we 
formalise the use of strategies in both interpretations, extending both semantic 
functions with a "path" argument of type 7T, say, which records the steps as 
the formula is decomposed and is used in the U(n) case as the argument to the 
strategy a{a). 

The tree construction within the inductive argument introduces two new fea- 
tures: (a) that the current tree may in fact be a subtree, depending from some 
path tt: II in the overall tree corresponding to the original formula; and (b) that 
even though the whole tree is built from a closed formula, we must consider 



A Full proof of Lem. HI from Sec. 16.11 




(15) 
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free variables in the inductive argument because it descends into the body of 
fixed-points. 

The first feature (a) affects the use of the strategy functions: when resolving
a $\sqcap$-choice, say, the path passed to the history-dependent minimising strategy
$\underline{\sigma}$ must be the path from the overall root, that is the current path within the
subtree appended to the path $\pi$ from which the whole subtree depends. Thus we
supply a path as an extra argument to the tree-generating function, that is we
write $[\![\phi]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi.s$, following the convention that $\pi$ does not include the current
position $(\phi, s)$. If $\pi$ is omitted (as in the statement of the lemma) then it is
taken to be the empty path $\langle\rangle$.
+ The explicit path argument also provides a neat formalisation of the colour
operations: we simply let the colours be subscripted variables, creating colour
$X_i$ from bound variable $X$ where $i$ is the length of the path $\pi$ at the point the
fixed-point formula binding $X$ is encountered. Then to look up colour $X_i$ at some
later point $\pi'$ extending $\pi$, we simply take the $i$-th element of $\pi'$ (it will contain
a fixed-point formula) and we construct $\Phi[X \mapsto X_i]$ for the formula retrieved.

Strategies achieve colour-insensitivity by ignoring the subscript, treating
position $(X_i, s)$ as just $(X, s)$.

For the second feature (b) we assume that all free variables $X$ in the current
formula are defined in the valuation $V$, taken to functions of type $\Pi \to S \to [0, 1]$;
note that these functions deliver real values, not subtrees. If we encounter $X$
when building the tree from current path $\pi$ and state $s$, we look up the value
$X$ in $V$ to get a function $f$, and then insert the leaf node $(f.\pi^{+}.s)$ directly into
the tree at that point, where $\pi^{+}$ is path $\pi$ routinely extended (as in Fig. 6 for
the Kozen semantics) with the current game position, in this case $(X, s)$. The
intention is that the stored function $f$ "short-circuits" the continued play from
$(X, s)$ after path $\pi^{+}$: it simply supplies the value directly.

Note that our extended tree-building looks up free variables $X$ in the valuation
$V$ ultimately to give a real number $x$ which is inserted as a leaf-node $(x)$,
whereas colours $X_i$ refer to position $i$ in the path $\pi$ to give a formula $\Phi[X \mapsto X_i]$
from which the tree-building then continues. A summary of the process was
shown in Fig. El and the game was given in Fig. (21).

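The extended Kozen semantics $|\!|\!|\phi|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s$ also accepts strategy sequences $\underline{\sigma}, \overline{\sigma}$
and a path argument $\pi$, and in the definitions the path argument is routinely
extended step-by-step so that it simulates the path that would be encountered
in the corresponding tree; see Fig. 6. Again, an omitted path defaults to empty.

The alternative approach of passing pre-determined strategy sequences (for
example, an infinite sequence of Booleans each meaning "go left" or "go right" and
consumed as it is used) is not available to us.

Normally one argues that such sequences achieve full access to the history because,
in pre-selecting say true or false for a given position in the strategy sequence, one
has already made all the earlier selections, and from those the formula/state that
the current Boolean must deal with can in principle be determined. In our case the
probabilistic choices are taken as the game is played, and the current formula/state
cannot be predicted: thus the strategy functions take an explicit path argument in
order to look back and see how earlier probabilistic choices were resolved.

1. $|\!|\!|X|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s \;=\; V.X.\pi.s$ .

2. $|\!|\!|A|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s \;=\; V.A.s$ .

3. $|\!|\!|\{k\}\Phi|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s \;=\; V.k.s.\$ \;+\; \int_{V.k.s} |\!|\!|\Phi|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}$ .

4. $|\!|\!|\Phi' \sqcap \Phi''|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s \;=\; |\!|\!|\Phi'|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s$ if $\underline{\sigma}.\pi^{+}$, and $|\!|\!|\Phi''|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s$ otherwise.

5. $|\!|\!|\Phi' \lhd G \rhd \Phi''|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s \;=\; |\!|\!|\Phi'|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s$ if $V.G.s$, and $|\!|\!|\Phi''|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s$ otherwise.

6. $|\!|\!|(\mu X \bullet \Phi)|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s \;=\; (\mathrm{lfp}\, x \bullet |\!|\!|\Phi|\!|\!|_{V[X \mapsto x]}^{\underline{\sigma},\overline{\sigma}}).\pi^{+}.s$ .

7. $|\!|\!|(\nu X \bullet \Phi)|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s \;=\; (\mathrm{gfp}\, x \bullet |\!|\!|\Phi|\!|\!|_{V[X \mapsto x]}^{\underline{\sigma},\overline{\sigma}}).\pi^{+}.s$ .

8. (Colours are not used in the $|\!|\!|\cdot|\!|\!|$ semantics.)

The extra argument $\pi$ is a sequence of game positions, called a path; in each case path
$\pi^{+}$ is defined to be $\pi$ extended with the game position $(\phi, s)$, where $\phi$ is the entire
formula on the left-hand side.

Note that in Case 1 the value $V.X$ retrieved from the environment is applied to the
current path and state; in Case 2 however, only the state is used.

The strategy functions $\underline{\sigma}, \overline{\sigma}$ are passed the current path when required (in Clause 4,
where we give only the $\sqcap/\underline{\sigma}$ case).

The type of $x$ in the fixed-point clauses 6, 7 is path to state to [0, 1].

Fig. 6. Path/strategy-extended Kozen semantics; compare Fig.|21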

The inductive argument thus treats the stronger hypothesis which includes the
above features; it is that for all $qM\mu$ formulae $\phi$, valuations $V$, paths $\pi$, states $s$
and strategies $\underline{\sigma}, \overline{\sigma}$, we have

$$\int_{[\![\phi]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi.s} \mathit{Val} \;=\; |\!|\!|\phi|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s \;, \qquad (16)$$

provided all free variables in $\phi$ are mapped by $V$ to functions of type $\Pi \to S \to [0, 1]$
and that all colours in $\phi$ are mapped to formulae by $\pi$. Our original goal
(15) is the case of (16) in which $V$ defines only language constants and $\pi$ is
empty.



We now give a representative selection of the cases in the inductive argument. 

Recall that neither colours nor free variables appear in the original formula, which
is why the specialisation of (16) to (15) is appropriate.

Base case $\phi$ is $X$. From Fig. |21 we have that the game-subtree $[\![X]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi.s$
is just the tip $(V.X.\pi.s)$, which value we note from the typing of $V$ given just
before (16) is indeed a real $r$, say, in [0, 1]; from Def. QJ of $\mathit{Val}$ we then have that
the left-hand side of (16) has value $r$.

From Case 1 of Fig. 6 we have that the right-hand side is

$$|\!|\!|X|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s \;=\; (V.X).\pi.s \;,$$

i.e. is $r$ also.



Base case $\phi$ is $A$. Here the game-subtree $[\![A]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi.s$ is just the tip $(V.A.s)$,
which is correctly typed because $V$ takes constant-expectation symbols to
functions in $S \to [0, 1]$. The path is ignored.

From Case 2 of Fig. 6 we have that the right-hand side is $|\!|\!|A|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s = V.A.s$,
in which again the path is ignored.

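Inductive case $\phi$ is $\{k\}\Phi$. The game-subtree $[\![\{k\}\Phi]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi.s$ has $(\phi, s)$ at its
root, and is extended by a finite number of branches, one to each possible next
state $s'$ plus one to the special payoff state \$. Beneath branch $s'$, which has
probability $V.k.s.s'$, is the subtree $[\![\Phi]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s'$ where $\pi^{+}$ is $\pi$ extended with
$(\phi, s)$ to record the node just passed through; and branch \$, which has probability
$1 - \sum_{s':S} V.k.s.s'$, is terminated by $(y)$ where the real value $y \in [0, 1]$ is the payoff
$V.k.s.\$ / (1 - \sum_{s':S} V.k.s.s')$ as at ^.
We now have

$$\begin{aligned}
& \int_{[\![\{k\}\Phi]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi.s} \mathit{Val} \\
=\;& V.k.s.\$ + \Big(\sum_{s':S} V.k.s.s' \times \int_{[\![\Phi]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s'} \mathit{Val}'\Big) && \text{for } \mathit{Val}' \text{ derived from } \mathit{Val}\text{: see } (\dagger) \text{ below} \\
=\;& V.k.s.\$ + \Big(\sum_{s':S} V.k.s.s' \times \int_{[\![\Phi]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s'} \mathit{Val}\Big) && \mathit{Val}', \mathit{Val} \text{ prefix-insensitive: see } (\ddagger) \text{ below} \\
=\;& V.k.s.\$ + \Big(\sum_{s':S} V.k.s.s' \times |\!|\!|\Phi|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s'\Big) && \text{structural induction} \\
=\;& V.k.s.\$ + \int_{V.k.s} |\!|\!|\Phi|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+} \\
=\;& |\!|\!|\{k\}\Phi|\!|\!|_V^{\underline{\sigma},\overline{\sigma}}.\pi.s && \text{Fig. 6}
\end{aligned}$$

as required for this case.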

In the deferred justifications we are simply using the way in which expected
values operate over tree-based distributions. We note that the expected value
$E$ of $\mathit{Val}$ over a (sub-)tree $T$ is the sum over its immediate children $T_i$ of
the expected value $E_i$ assigned to $T_i$ times the probability $p_i$ labelling the
branch $i$ that leads to it: that is, $E = \sum_i p_i \times E_i$. The expected values for
the children are calculated just as for the parent, except that as we examine
each child on its own, from its root, we must use the function $\mathit{Val}'$ defined
$\mathit{Val}'.\pi' = \mathit{Val}.(\text{"}(\phi, s)\text{ followed by }\pi'\text{"})$ instead of the original $\mathit{Val}$, to take
account of the fact that we have passed through node $(\phi, s)$ before reaching the
child.

We can however exploit the nature of our particular $\mathit{Val}$, that it is not affected
by adding finite prefixes to its argument; thus we can immediately replace $\mathit{Val}'$
by $\mathit{Val}$ again.
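The recursion $E = \sum_i p_i \times E_i$ can be illustrated on a finite tree. The sketch below is ours, not the paper's construction: it assumes a finite game-tree whose tips carry their payoff directly, so that (by prefix-insensitivity) no path argument is needed; the constructors `tip` and `node` and the numbers in the example are hypothetical.

```python
# Hypothetical finite game-tree: a tip carries its payoff; a node carries a
# game position and a list of (probability, child) branches.

def tip(value):
    return ("tip", value)


def node(position, branches):
    return ("node", position, branches)


def expected(tree):
    """E = sum_i p_i * E_i over immediate children; a tip yields its value."""
    if tree[0] == "tip":
        return tree[1]
    _, _position, branches = tree
    return sum(p * expected(child) for p, child in branches)


# A {k}-node at state s: two ordinary successors and the payoff branch $.
# We assume V.k.s.$ = 0.125 and successor probabilities 0.5 and 0.25, so the
# $-branch has probability 0.25 and payoff y = 0.125 / 0.25 = 0.5.
payoff_mass = 0.125
y = payoff_mass / 0.25
k_node = node(("{k}Phi", "s"), [
    (0.5, tip(1.0)),    # stands in for the subtree below successor s'
    (0.25, tip(0.0)),   # stands in for the subtree below successor s''
    (0.25, tip(y)),     # payoff state $
])
value = expected(k_node)
```

In this instance the result equals $V.k.s.\$$ plus the probability-weighted values of the successor subtrees, which is the shape of the first line of the calculation for the $\{k\}\Phi$ case.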

From that point on, the calculation of expected values $\int \mathit{Val}$ behaves as above.

Inductive case $\phi$ is $\Phi' \sqcap \Phi''$. The game-tree again has $(\phi, s)$ at its root, but
is extended with a single probability-one branch leading either to $[\![\Phi']\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s$
or $[\![\Phi'']\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s$ depending on whether $\underline{\sigma}.\pi^{+}$ is true (take $\Phi'$) or false (take $\Phi''$).
Note that the state is not changed, and that the strategy function is applied to
$\pi^{+}$ (not $\pi$), so that it has access to the current formula and state.

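Inductive case $\phi$ is $(\mu X \bullet \Phi)$. Here in Fig. 6 we appeal to $\sqcup$-continuity
to write the right-hand side as a limit

$$(\sqcup n \bullet f^{n}.0).\pi^{+}.s \quad \text{where } f.x = |\!|\!|\Phi|\!|\!|_{V[X \mapsto x]}^{\underline{\sigma},\overline{\sigma}}$$

and $0.\pi'.s' = 0$ for all $\pi'$ and $s'$,

after which we will show by mathematical induction that for all $n$, states $s'$ and
all extensions $\pi'$ of $\pi^{+}$ we have

$$f^{n}.0.\pi'.s' \;=\; \int_{[\![\Phi[X \mapsto X_i]]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi'.s'} \mathit{Val}_n \qquad (17)$$

for suitably defined approximants $\mathit{Val}_n$ of $\mathit{Val}$, where $X_i$ is the colour chosen
at position $(\phi, s)$ during the tree-building when the fixed-point formula was
encountered.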



Because we have both least- and greatest fixed points, the justification of this
assumption is not the usual "continuity is preserved by the operation of taking fixed
points": for example $\sqcup$-continuity is not necessarily preserved by $\nu$.

In fact we have analytic continuity, which over [0, 1] implies $\sqcup$, $\sqcap$ continuity, from
Lem. 11; see the remark about its being maintained inductively, at (+) in the proof.

Our overall conclusion will follow by taking limits on both sides, appealing
to bounded monotone convergence [12] to distribute $\sqcup$ through $\int$ on the right.

Define $\mathit{Val}_n.\pi'$ for any path $\pi'$ to be just $\mathit{Val}.\pi'$ provided $\pi'$ contains fewer
than $n$ occurrences of colour $X_i$; if however $\pi'$ contains at least $n$ occurrences of
$X_i$, define $\mathit{Val}_n.\pi'$ to be zero instead. We have $(\sqcup n \bullet \mathit{Val}_n) = \mathit{Val}$ because for
all $\pi'$ with only finitely many $X_i$ we have $\mathit{Val}_n.\pi' = \mathit{Val}.\pi'$ for large-enough $n$;
and for those $\pi'$ with infinitely-many $X_i$ we have zero in both cases.

We now give the proof of (17), by induction over $n$: in Case 0, both sides are
zero.

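In Case $n + 1$, we reason that for all $s'$ and extensions $\pi'$ of $\pi^{+}$ we have

$$\begin{aligned}
& f^{n+1}.0.\pi'.s' \\
=\;& f.(f^{n}.0).\pi'.s' \\
=\;& |\!|\!|\Phi|\!|\!|_{V[X \mapsto f^{n}.0]}^{\underline{\sigma},\overline{\sigma}}.\pi'.s' && \text{definition } f \\
=\;& \int_{[\![\Phi]\!]_{V[X \mapsto f^{n}.0]}^{\underline{\sigma},\overline{\sigma}}.\pi'.s'} \mathit{Val} && \text{structural induction} \\
=\;& \int_{[\![\Phi]\!]_{V[X \mapsto f^{n}.0]}^{\underline{\sigma},\overline{\sigma}}.\pi'.s'} \mathit{Val}_{n+1} && \text{see } (\dagger) \text{ below} \\
=\;& \int_{[\![\Phi]\!]_{V[X \mapsto g]}^{\underline{\sigma},\overline{\sigma}}.\pi'.s'} \mathit{Val}_{n+1} && \text{inductive appeal to (17), with } g \text{ as defined below} \\
=\;& \int_{[\![\Phi[X \mapsto X_i]]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi'.s'} \mathit{Val}_{n+1} && \text{see } (\ddagger) \text{ below}
\end{aligned}$$

thus establishing the inductive case. For the fourth step: because $\Phi$ contains no $X_i$,
and $\pi'$ extends $\pi$, the tree $[\![\Phi]\!]_{V[X \mapsto f^{n}.0]}^{\underline{\sigma},\overline{\sigma}}.\pi'.s'$ made from them will contain no
$X_i$'s either; thus replacing $\mathit{Val}$ by $\mathit{Val}_{n+1}$ will make no difference. For the fifth
step: for all extensions $\pi''$ of $\pi'$ and states $s''$, define

$$g.\pi''.s'' \;=\; \int_{[\![\Phi[X \mapsto X_i]]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi''.s''} \mathit{Val}_n$$

so that $g = f^{n}.0$.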

$\dagger$ For the first deferred justification, we note that $X_i$'s can come from only three
places: (1) from $\Phi$ itself (but $\Phi$ contains no $X_i$, since $X_i$ was fresh); (2) from the
interior of formulae retrieved from $\pi$ by looking up other colours in $\Phi$ (but $\pi$
contains no "embedded" $X_i$ either, again because it was fresh); or (3) from the
subsequent creation of colours (but they themselves will be fresh, different from
$X_i$, by construction, guaranteed by the fact that the length of $\pi'$ exceeds the
length of $\pi$ and that the length determines the subscript of any newly-created
colour).

That is the import of "choose a fresh colour" in the tree-building algorithm.

Here is where we use the fact that $\mathit{Val}$ is defined to yield zero if a $\mu$-colour occurs
infinitely often.

Here is where we use the extended hypothesis (16), rather than the original (15),
because $\Phi$ may contain free variable $X$. Also, we rely here on "for all $V$" being part
of the inductive hypothesis, since we are using $V[X \mapsto f^{n}.0]$.

$\ddagger$ For the final step we are claiming, roughly speaking, that at all the points in the
constructed tree where $X$ occurs (left-hand side) or "used to be" (right-hand
side, now replaced by $X_i$), the function $g$ was defined precisely so that it makes
no difference to the integral $\int \mathit{Val}_{n+1}$ whether we

1. look up variable $X$ in $V[X \mapsto g]$ to get $g$, which applied to the path $\pi''$ and
state $s''$ at that point gives a tip $(g.\pi''.s'')$ directly, or

2. look up colour $X_i$ in path $\pi''$, to recover the formula $\Phi[X \mapsto X_i]$ and carry on
building the tree below.

That is, the value in the tip constructed at (1) is exactly the value realised from
the tree constructed at (2) by the integral $\int \mathit{Val}_{n+1}$.

In more detail: we are in fact relying on an elementary property of $\int F$ over
game-trees, for general $F$. Take any game-tree $T$, and describe subtrees of it as
pairs $(\pi, U)$, where $U$ is (also) a game-tree and $\pi$ is the path leading from the
root of $T$ to just before the root of $U$. Let $T[(\pi, U) \mapsto V]$ be the tree resulting
from replacing that entire subtree by another tree $V$. We then have that

$$\int_{T[(\pi, U) \mapsto V]} F \;=\; \int_{T} F \;-\; \int_{U} F_{\pi} \;+\; \int_{V} F_{\pi} \;, \qquad (18)$$

where $F_{\pi}.\pi' = F.(\pi \mathbin{+\!\!+} \pi')$ for all $\pi'$. In effect, on the right we subtract
the contribution made by $U$ and then add back the contribution made by $V$,
but in each case we use $F_{\pi}$ over the sub-tree to compensate for the fact that
its contribution is made within (i.e. at $\pi$) the overall tree $T$. Furthermore, the
above holds for any countable pairwise-disjoint set of such substitutions done
simultaneously.

Now for the final step in our proof above we reason backwards, from the
last expression (call it $[\ddagger]$) to the second-last, $[\dagger]$, using an instantiation of
(18). We unify $[\ddagger]$ and the first term on the right-hand side of (18) by choosing
function $F$ to be $\mathit{Val}_{n+1}$, and the tree $T$ to be $[\![\Phi[X \mapsto X_i]]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi'.s'$.

We write $\mathbin{+\!\!+}$ for path concatenation.

We require that the set of all paths affected is measurable, which is why we require
countability of the subtrees.

Now Tree $T$ contains (at most) a countable number, $k$-indexed say, of "first
encounter of $X_i$ from the root of $T$" positions $(X_i, s^k)$, and each is the final
element of some path $\pi^k$ containing no other $X_i$; below each $\pi^k$ is some subtree
$U^k$, which from our tree-construction procedure we know will be

$$[\![\Phi[X \mapsto X_i]]\!]_V^{\underline{\sigma},\overline{\sigma}}.(\pi' \mathbin{+\!\!+} \pi^k).s^k \;, \qquad (19)$$

since $\Phi[X \mapsto X_i]$ is what is returned when we look up colour $X_i$ and $\pi' \mathbin{+\!\!+} \pi^k$ is the
overall path that leads to this point. (Refer Case|Sl of Fig. [51.) One-by-one we
will use these $U^k$'s as $U$ in countably-many applications of (18).

For each $k$ the function $F_{\pi}$ in (18), which we will call $F_{\pi^k}$, will be $(\mathit{Val}_{n+1})_{\pi^k}$
because of our choice above of $F$. But that is just $\mathit{Val}_n$, because $\pi^k$ contains
exactly one $X_i$ (at its end) which "uses up" the +1 in the subscript $n+1$. Thus
for each $k$ the second term on the right of (18) is the integral of $\mathit{Val}_n$ taken
over Tree (19), viz.

$$x^k \;=\; \int_{[\![\Phi[X \mapsto X_i]]\!]_V^{\underline{\sigma},\overline{\sigma}}.(\pi' \mathbin{+\!\!+} \pi^k).s^k} \mathit{Val}_n \;. \qquad (20)$$

Now for the third term we choose the $V^k$ (to replace $U^k$) to be the trivial
subtree comprising just a tip $(x^k)$; that makes $\int_{(x^k)} \mathit{Val}_n$ just $x^k$ again.

With the second and third terms in (18) equal, the first term on its own
(which we recall is $[\ddagger]$) equals the left-hand side. Figures 7 and 8 illustrate the
trees occurring in the left- and right-hand sides of (18).

We will now show that the left-hand side of (18) is equal to $[\dagger]$. The tree used
there (Fig. 7) is

$$T[(\pi^0, U^0) \mapsto V^0,\ (\pi^1, U^1) \mapsto V^1,\ \cdots] \;,$$

i.e. the result of all the $k$-indexed substitutions done simultaneously, and
each $V^k$ is just the tip $(x^k)$. But the tree $[\![\Phi]\!]_{V[X \mapsto g]}^{\underline{\sigma},\overline{\sigma}}.\pi'.s'$ used in $[\dagger]$ is the same
except that it contains the tip $(g.\pi''.s'')$ at those places. (The places agree
because they are both determined by the occurrences of $X$ in the original formula
$\Phi$.)

Comparison of the definition of $g$ (at $[\dagger]$), noting its arguments at each $k$
will be $\pi'' = \pi' \mathbin{+\!\!+} \pi^k$ and $s'' = s^k$, and the definition of $x^k$ (at (20)) shows
those tip-values to be equal.

That concludes our justification of the final step above, and of our inductive
proof of (17) as a whole.

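Using (17) we finish off the proof of this case as follows. Choose path $\pi^{+}$
itself and state $s$; then with bounded monotone convergence we have

$$\begin{aligned}
& (\sqcup n \bullet f^{n}.0).\pi^{+}.s \\
=\;& \Big(\sqcup n \bullet \int_{[\![\Phi[X \mapsto X_i]]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s} \mathit{Val}_n\Big) && \text{from (17) in the special case } \pi' = \pi^{+} \\
=\;& \int_{[\![\Phi[X \mapsto X_i]]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s} (\sqcup n \bullet \mathit{Val}_n) && \text{bounded monotone convergence } (*) \\
=\;& \int_{[\![\Phi[X \mapsto X_i]]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s} \mathit{Val} && (\sqcup n \bullet \mathit{Val}_n) = \mathit{Val} \\
=\;& \int_{[\![(\mu X \bullet \Phi)]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi.s} \mathit{Val} && \text{tree-building step for } \mu \text{ (backwards); } X_i \text{ looks up } \Phi[X \mapsto X_i] \text{ in } \pi^{+}
\end{aligned}$$

where the final step is the one in which colour $X_i$ was generated.

[Figure 7: diagram not reproduced; tips $V^k$ replace the original sub-trees $U^k$.]

Fig. 7. Left-hand side of (18): tree $T$ after subtrees $U^k$ replaced by tips $V^k$.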

Inductive case $\phi$ is $(\nu X \bullet \Phi)$. This case is essentially the same as the $\mu$-case:
we define the truncated valuations $\mathit{Val}_n.\pi'$ as before except that paths $\pi'$
with at least $n$ occurrences of $X_i$ are taken to one (rather than to zero).

A small complication however occurs in the use of bounded monotone convergence,
which requires the sequence of valuations to be monotone non-decreasing:
at the point corresponding to $(*)$ above we would in this case be arguing that

$$\Big(\sqcap n \bullet \int_{[\![\Phi[X \mapsto X_i]]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s} \mathit{Val}_n\Big) \;=\; \int_{[\![\Phi[X \mapsto X_i]]\!]_V^{\underline{\sigma},\overline{\sigma}}.\pi^{+}.s} (\sqcap n \bullet \mathit{Val}_n) \;,$$

where the terms are non-increasing. Since all the terms lie in [0, 1] however, we
can deal with it by subtracting from one throughout, before and after.

The equality (17) is for all extensions $\pi'$ of $\pi^{+}$ because of its inductive proof: the
stronger hypothesis is used when defining $g$.

[Figure 8: diagram not reproduced. Labels: original game position; current game
position, with variable $X$ replaced by fresh colour $X_i$; position $(\Phi[X \mapsto X_i], s')$;
possible further sub-trees; sub-trees $U^k$.]

Fig. 8. Right-hand side of (18): tree $T$ before subtrees $U^k$ replaced by tips $V^k$.



B Memoriless strategies suffice 
over a finite state space 



We show that for any formula $\phi$, possibly including $\sqcap$ and $\sqcup$ strategy operators,
there are specific state predicates (collected into tuples $\underline{G}$ and $\overline{G}$) that can
replace the strategy operators without affecting the value of the formula. The
inductive proof is straightforward except for replacement of $\sqcup$ within $\mu$ (and,
dually, replacement of $\sqcap$ within $\nu$). For this $\overline{G}/\mu$ case we need several technical
lemmas and definitions; the other cases are set out at |2H1-

Because the argument in this section is mainly over properties of real-valued
functions, we shift to a more mathematical style of presentation. Variables
$f, g, \ldots$ denote (Curried) functions of type expectation(s) to expectation, and
$w, x, \ldots$ are expectations in $\mathcal{E}S$. For function $f$ of one argument we write $\mu.f$ for
its least fixed-point.

Definition 3. Almost-linear. Say that an expectation-valued function $f$ of
possibly several expectation arguments is almost-linear if it can be
written in the form

$$f.x.y\cdots.z \;=\; w + g.x + h.y + \cdots + i.z \;, \qquad (21)$$

where $w$ is an expectation and $g, h, \cdots, i$ are linear expectation-valued functions
of their single arguments.

Lemma 7. Every $\sqcap/\sqcup$-free formula $\phi$, possibly containing free expectation
variables $X, Y, \ldots, Z$, denotes an almost-linear function of the values assigned to
those arguments.

Proof. (sketch) What we are claiming is that the function

$$f.x.y\cdots.z \;=\; \|\phi\|_{V[x, y \cdots z / X, Y \cdots Z]}$$

can be written in the form given on the right at (21), provided $\phi$ contains no $\sqcap$
or $\sqcup$. This is a straightforward structural induction over $\phi$, given in full at WBj.

Definition 4. Almost less-than. For non-negative reals $a, b$, write $a \ll b$
for $a > 0 \Rightarrow a < b$; write the same for the pointwise-extended relation over
expectations. Note that $a < b$ implies $a \ll b$ implies $a \le b$ on this domain.
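As a small illustration of Definition 4, the relation and its pointwise extension can be written out directly. This is a sketch under the assumption that expectations over a finite state space are represented as lists of non-negative floats; the function names are ours.

```python
def almost_less(a, b):
    """a << b  iff  (a > 0 implies a < b), for non-negative reals a, b."""
    return (not a > 0) or a < b


def almost_less_pointwise(x, y):
    """Pointwise extension to expectations given as equal-length lists."""
    return all(almost_less(a, b) for a, b in zip(x, y))


# The chain  a < b  =>  a << b  =>  a <= b  checked on a small grid.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
chain_holds = all(
    ((not a < b) or almost_less(a, b)) and ((not almost_less(a, b)) or a <= b)
    for a in grid for b in grid
)
```

The only place where the relation differs from strict less-than is at zero: `almost_less(0.0, 0.0)` holds, whereas `almost_less(0.5, 0.5)` does not.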

Definition 5. ok functions. Say that an expectation-to-expectation function
$f$ of one argument is ok if for all expectations $x$ with $x \ll f.x$ we have that
$x \le \mu.f$.

Lemma 8. If $f$ is almost-linear then $f$ is ok in each argument separately.
Proof. See Appendix\^

Lemma 9. All $\sqcap/\sqcup$-free formulae $\phi$ denote ok functions of their free
expectations $X, Y, \cdots, Z$ taken separately.

Proof. Lemmas 7 and 8.

The following result forms the core of Everett's argument [10]; note it does
not depend on $f$'s being ok.

Lemma 10. For any monotonic and continuous function $f$ over expectations,
and any $\varepsilon > 0$, there is an expectation $x$ such that

$$x \ll f.x \qquad (22)$$

$$\text{and} \quad \mathrm{lfp}.f - \underline{\varepsilon} < x \;, \qquad (23)$$

where $\underline{\varepsilon}$ is the everywhere-$\varepsilon$ expectation. That is, we can find an
almost-increased-by-$f$ expectation $x$ that approaches $\mathrm{lfp}.f$ as closely as we please from below.

This is continuity in the usual sense in analysis; see Footnote 53. With monotonicity
we have $\sqcup$-continuity for $f$ as well.

Proof. Define a subset $T$ of the state space $S$ by

$$T \;=\; \{ s : S \mid f.0.s = \mathrm{lfp}.f.s \} \;, \qquad (24)$$

so that the subset $T$ is "the termination set for $f$", comprising those states at
which $f$ reaches its fixed-point in just one step. Because $S$ is finite we can proceed
by induction decreasing over these sets $T$ determined by $f$, with the base case
therefore being when $T$ is all of $S$.
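To make the termination set (24) concrete, here is a minimal numerical sketch. It assumes a finite state space indexed $0, \ldots, n-1$, expectations as lists of floats, and least fixed points approximated by iteration from zero; the two example functions and all helper names are hypothetical.

```python
def lfp(f, n_states, tol=1e-12, max_iter=10_000):
    """Approximate the least fixed point of a monotone, continuous f from zero."""
    x = [0.0] * n_states
    for _ in range(max_iter):
        nxt = f(x)
        if max(abs(a - b) for a, b in zip(nxt, x)) <= tol:
            return nxt
        x = nxt
    return x


def termination_set(f, n_states, tol=1e-9):
    """T = { s | f.0.s = lfp.f.s }, compared up to a small tolerance."""
    first = f([0.0] * n_states)
    fix = lfp(f, n_states)
    return {s for s in range(n_states) if abs(first[s] - fix[s]) <= tol}


# State 0 reaches its fixed-point value in one step; state 1 needs two.
f = lambda x: [0.5, 0.5 * x[0]]
T_f = termination_set(f, 2)

# A constant function terminates everywhere: this is the base case T = S.
g = lambda x: [0.5, 0.25]
T_g = termination_set(g, 2)
```

For `f` the set is strictly smaller than the state space, so the inductive case would apply; for `g` every state is already in the termination set.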

We strengthen Condition (23) of the inductive hypothesis to read

$$\mathrm{lfp}.f - \underline{\varepsilon} \;<\; x \;\le\; \mathrm{lfp}.f \;. \qquad (23b)$$

Case $T = S$. Define $x = (\mathrm{lfp}.f - \underline{\varepsilon}) \sqcup 0$ so that (23b) is satisfied trivially.
Since $\varepsilon > 0$ we have also that $x \ll \mathrm{lfp}.f$, and then from $T = S$ and monotonicity
of $f$ we reason

$$x \;\ll\; \mathrm{lfp}.f \;=\; f.0 \;\le\; f.x \;,$$

sufficient for (22).

Case $T \subset S$. Pick $s^*$ from $S - T$, and for all $x$ define

$$f^*_v.x \;=\; (f.x)[s^* \mapsto v] \;;$$

that is the expectation that agrees with $f.x$ everywhere except possibly at $s^*$ where
it takes the value $v$ instead. Define also $v^* = \mathrm{lfp}.f.s^*$, and note that $v^* > 0$
because otherwise we would have $s^* \in T$.

We begin by showing two things about $f^*_v$. The first (a) is that "the termination
set for $f^*_v$", that is $T^* = \{ s : S \mid f^*_v.0.s = \mathrm{lfp}.f^*_v.s \}$, is a strict superset
of $T$ when $v \le v^*$, which will allow an appeal to the inductive hypothesis. The
second (b) is that the function $\mathrm{lfp}.f^*_v$ of $v$ approaches $\mathrm{lfp}.f$ as $v$ approaches $v^*$
from below, and attains it in the limit.

To show (a) we assume $v \le v^*$ and note first that

$$\mathrm{lfp}.f^*_v \;\le\; \mathrm{lfp}.f \;, \qquad (25)$$

because

$$f^*_v.(\mathrm{lfp}.f).s \;=\; f.(\mathrm{lfp}.f).s \;=\; \mathrm{lfp}.f.s \quad \text{for } s \ne s^*$$

and $f^*_v.(\mathrm{lfp}.f).s^* = v \le \mathrm{lfp}.f.s^*$ by assumption.
That establishes $f^*_v.(\mathrm{lfp}.f) \le \mathrm{lfp}.f$, sufficient for (25) by the least-fixed-point
property.

The extra condition $x \le \mathrm{lfp}.f$ is used at Footnote 52 in the subsidiary Lem. 12
below.

In the following argument we hold $s^*$ fixed, which is why to avoid clutter we can
omit it from the notation $f^*_v$. We will however vary $v$.

This fact is used at Footnote 47.
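From (25) we show $T \cup \{s^*\} \subseteq T^*$ by considering two cases:

Case $s \in T$: We have

$$\begin{aligned}
& f^*_v.0.s \\
\le\;& f^*_v.(\mathrm{lfp}.f^*_v).s && \text{monotonicity} \\
=\;& \mathrm{lfp}.f^*_v.s && \text{fixed point} \\
\le\;& \mathrm{lfp}.f.s && (25) \\
=\;& f.0.s && s \in T \\
=\;& f^*_v.0.s \;. && s \ne s^*
\end{aligned}$$

Case $s = s^*$: We have

$$\begin{aligned}
& f^*_v.0.s^* \\
=\;& v && \text{definition } f^*_v \\
=\;& f^*_v.(\mathrm{lfp}.f^*_v).s^* && \text{again definition } f^*_v \\
=\;& \mathrm{lfp}.f^*_v.s^* \;. && \text{fixed point}
\end{aligned}$$

Thus when $v \le v^*$ we have $T \subset T \cup \{s^*\} \subseteq T^*$, which establishes (a).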



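To show (b) we note first that $\mathrm{lfp}.f^*_v$ is a continuous function of $v$ as $v$
increases. Thus we need only demonstrate that $\mathrm{lfp}.f^*_{v^*} = \mathrm{lfp}.f$, since (25)
already shows that $\mathrm{lfp}.f^*_v$ is below $\mathrm{lfp}.f$ for $v \le v^*$. Again we consider two cases
(and appeal to (25) itself in the second case):

Case $s \ne s^*$: We have $f.(\mathrm{lfp}.f^*_{v^*}).s = f^*_{v^*}.(\mathrm{lfp}.f^*_{v^*}).s = \mathrm{lfp}.f^*_{v^*}.s$.

Case $s = s^*$: We have

$$\begin{aligned}
& f.(\mathrm{lfp}.f^*_{v^*}).s^* \\
\le\;& f.(\mathrm{lfp}.f).s^* && \text{by (25) in the special case } v = v^* \\
=\;& \mathrm{lfp}.f.s^* && \text{fixed point} \\
=\;& v^* && \text{definition } v^* \\
=\;& f^*_{v^*}.(\mathrm{lfp}.f^*_{v^*}).s^* && \text{definition } f^*_{v^*} \\
=\;& \mathrm{lfp}.f^*_{v^*}.s^* \;. && \text{fixed point}
\end{aligned}$$

Thus $f.(\mathrm{lfp}.f^*_{v^*}) \le \mathrm{lfp}.f^*_{v^*}$, whence by the least-fixed-point property we have
$\mathrm{lfp}.f \le \mathrm{lfp}.f^*_{v^*}$ which, with (25) again in the case $v = v^*$, gives the
equality we need and establishes (b).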

The least-fixed-point property states that $f.x \le x$ implies $\mathrm{lfp}.f \le x$ for any monotonic
$f$ over a cpo.

This general result (continuity of fixed-points) requires in this case that $f^*_v$ is
$\sqcup$-continuous in $v$ and that each $f^*_v$ is itself $\sqcup$-continuous, the former trivial and the
latter following from $\sqcup$-continuity of $f$. It gives continuity over directed sets of $v$,
which we have because $v$ is increasing.

With (a) and (b) secure, we proceed to the main proof: we make a particular
choice of $v$ and appeal to the induction hypothesis in respect of $f^*_v$ to find an
expectation close to $\mathrm{lfp}.f^*_v$, and we then show how to derive from that a suitable
expectation $x$ satisfying (22, 23b) as required for the function $f$ in this case.

We choose $v$ first. From (b) we can choose $v < v^*$ to achieve

$$\mathrm{lfp}.f - \underline{\varepsilon_1} \;<\; \mathrm{lfp}.f^*_v \;\le\; \mathrm{lfp}.f \qquad (26)$$

for any $\varepsilon_1 < \varepsilon$ we please.

Now we appeal to the induction hypothesis: since $T \subset T^*$, for any $0 < \varepsilon_2 <
\varepsilon - \varepsilon_1$ we can find an $x_v^{\varepsilon_2}$ satisfying

$$\mathrm{lfp}.f^*_v - \underline{\varepsilon_2} \;<\; x_v^{\varepsilon_2} \;\le\; \mathrm{lfp}.f^*_v \qquad (27)$$

$$\text{and} \quad x_v^{\varepsilon_2} \;\ll\; f^*_v.x_v^{\varepsilon_2} \;. \qquad (28)$$

From that we have immediately

$$\begin{aligned}
& \mathrm{lfp}.f - \underline{\varepsilon} \\
<\;& \mathrm{lfp}.f - \underline{\varepsilon_1} - \underline{\varepsilon_2} && \text{choice of } \varepsilon_2 \\
<\;& \mathrm{lfp}.f^*_v - \underline{\varepsilon_2} && \text{left-hand inequality at (26)} \\
<\;& x_v^{\varepsilon_2} && \text{left-hand inequality at (27)} \\
\le\;& \mathrm{lfp}.f^*_v && \text{right-hand inequality at (27)} \\
\le\;& \mathrm{lfp}.f \;, && \text{right-hand inequality at (26)}
\end{aligned}$$

which is our (23b) if we take $x$ to be $x_v^{\varepsilon_2}$.

All that remains is (22), for which we require $x_v^{\varepsilon_2} \ll f.x_v^{\varepsilon_2}$; and indeed that
holds trivially everywhere except possibly at $s^*$: for if $s \ne s^*$ we have from (28)
that

$$x_v^{\varepsilon_2}.s \;\ll\; f^*_v.x_v^{\varepsilon_2}.s \;=\; f.x_v^{\varepsilon_2}.s \;. \qquad (29)$$

Thus all we are left with is to show that $x_v^{\varepsilon_2}.s^* \ll f.x_v^{\varepsilon_2}.s^*$, which unfortunately
is not true for all $x_v^{\varepsilon_2}$ satisfying (27, 28). But, as we demonstrate in the
technical Lem. 12 proved below, for any $\varepsilon_2 > 0$ it is possible to find an
$\varepsilon_2'$ with $0 < \varepsilon_2' \le \varepsilon_2$ which retains the properties (26, 27, 28) above and satisfies
$x_v^{\varepsilon_2'}.s^* \ll f.x_v^{\varepsilon_2'}.s^*$ as well, and which thus completes the proof.



We can now sketch the proof of the main result of this section.

Lemma 11. Fixed strategies suffice. For any formula $\phi$, possibly containing
strategy operators $\sqcap/\sqcup$, and valuation $V$, there are state-predicate tuples $\underline{G}/\overline{G}$,
possibly depending on $V$, such that

$$\|\phi_{\underline{G}}\|_V \;=\; \|\phi\|_V \;=\; \|\phi_{\overline{G}}\|_V \;.$$

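Recall Footnote 44 to see this is possible. In fact only $v \le v^*$ is needed here, in the
main proof; the strictness of the inequality is used at Footnote 54 in Lem. 12 below.

Here we use finiteness of the state space, since the one $\varepsilon_1$ applies for all states.

Proof. (sketch) We give only the $\mu$-case of an otherwise straightforward induction
over the size of $\phi$; a full proof may be found at

Suppose we are considering the case where $\phi$ is a least-fixed-point $(\mu X \bullet \Phi)$.
Let $f$ be the function denoted by $\Phi$ with respect to a single expectation-valued
argument $x$ supplied for the variable $X$, with the values of any other free variables
in $\Phi$ fixed by the environment $V$; for any $\underline{G}, \overline{G}$ let functions $f_{\underline{G}}$, $f_{\overline{G}}$ and $f_{\underline{G},\overline{G}}$ be derived
similarly from $\Phi_{\underline{G}}$, $\Phi_{\overline{G}}$ and $\Phi_{\underline{G},\overline{G}}$.

Case $\underline{G}$. We must show $\mu.f = \mu.f_{\underline{G}}$ for some $\underline{G}$; note that $\mu.f \le \mu.f_{\underline{G}}$
trivially, since $f \le f_{\underline{G}}$. Since $\Phi$ is smaller in size than $\phi$, our inductive hypothesis
provides for any $x$ a $\underline{G}$ so that $f_{\underline{G}}.x = f.x$; take $x = \mu.f$ and therefore choose
$\underline{G}$ so that $f_{\underline{G}}.(\mu.f) = f.(\mu.f) = \mu.f$. Thus $\mu.f$ is a fixed-point of $f_{\underline{G}}$, whence
immediately $\mu.f_{\underline{G}} \le \mu.f$.

Case $\overline{G}$. In this case we must show $\mu.f = \mu.f_{\overline{G}}$ for some $\overline{G}$; again it is trivial
that $\mu.f_{\overline{G}} \le \mu.f$ for any $\overline{G}$.

For the other direction, in fact we show that for any $\varepsilon > 0$ there is a $\overline{G}_\varepsilon$ such
that $\mu.f_{\overline{G}_\varepsilon} > \mu.f - \underline{\varepsilon}$, whence the existence of a single $\overline{G}$ satisfying $\mu.f_{\overline{G}} \ge \mu.f$
follows from the finiteness of the state space (since the set of possible strategy
tuples for this $f$ is therefore finite as well, and so there must be one that works
for all $\varepsilon$).

+ Because we know inductively that $f$ is a minimax over strategy tuples $\underline{G}', \overline{G}$
of almost-linear functions $f_{\underline{G}',\overline{G}}$, that those functions are continuous by construction,
and that the minimax is finite because there are only finitely many strategy
tuples $\underline{G}', \overline{G}$ for this $f$, we know that $f$ is continuous itself, and by Lem. 10 we
therefore have an expectation $x_\varepsilon$ with

$$\mu.f - \underline{\varepsilon} < x_\varepsilon \quad \text{and} \quad x_\varepsilon \ll f.x_\varepsilon \;. \qquad (30)$$

To get our result we need only show in addition that $x_\varepsilon \le \mu.f_{\overline{G}_\varepsilon}$ for some $\overline{G}_\varepsilon$.

From our inductive hypothesis we can choose $\overline{G}_\varepsilon$ so that $f.x_\varepsilon = f_{\overline{G}_\varepsilon}.x_\varepsilon$,
whence from (30) we have $x_\varepsilon \ll f_{\overline{G}_\varepsilon}.x_\varepsilon$. But in fact $f_{\overline{G}_\varepsilon}$ is ok (see below), so
from Def. 5 we have $x_\varepsilon \le \mu.f_{\overline{G}_\varepsilon}$ and we are done.

To see that $f_{\overline{G}_\varepsilon}$ is ok, we apply the argument of Case $\underline{G}$, which gives us a
$\underline{G}'$ with $\mu.f_{\overline{G}_\varepsilon} = \mu.f_{\underline{G}',\overline{G}_\varepsilon}$. Now consider any $x$ such that $x \ll f_{\overline{G}_\varepsilon}.x$.

Since $f_{\overline{G}_\varepsilon}.x \le f_{\underline{G}',\overline{G}_\varepsilon}.x$ we have $x \ll f_{\underline{G}',\overline{G}_\varepsilon}.x$ also; but we recall from Lem. 9
that $f_{\underline{G}',\overline{G}_\varepsilon}$ is ok. Hence $x \le \mu.f_{\underline{G}',\overline{G}_\varepsilon} = \mu.f_{\overline{G}_\varepsilon}$, and $f_{\overline{G}_\varepsilon}$ is ok as well.

Note that $(\mu X \bullet \Phi)_{\underline{G}}$ is the same as $(\mu X \bullet \Phi_{\underline{G}})$ (it is syntactic substitution) so
that $\mu.(f_{\underline{G}})$ is indeed the correct denotation.

An argument similar to that used in Lem.|S] makes this explicit.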
®^ That argument makes an appeal to the inductive hypothesis in respect of , a 
smaller formula than (j>- Note however it is not a subformula of (j>, which is why we 
do not use structural induction. 
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Lemma 12. In this technical lemma we continue the notation established within 
the proof of Lem. \lfA we show that for any 62 > there is an with < £2 ^ ^2 
and an xt/' that together retain the properties \21l2^2l^) and in addition satisfy 
.Xy' .s* , as we require above. 

Proof. In fact we find ε2' and x_v which together satisfy the stronger property 

    x_v.s* < f.x_v.s* . 

Suppose for a contradiction that, for all 0 < ε2' ≤ ε2, every x_v we could choose 
satisfying those properties for f*_v, that is lfp.f*_v − ε2' ≤ x_v ≤ lfp.f*_v and 
x_v ≪ f*_v.x_v, satisfied the inequality 

    f.x_v.s* ≤ x_v.s*    (31) 

as well, thus failing to have the property (28) for f at s* that we needed to 
complete our proof of Lem. 10. As ε2' approaches zero we would then have 
from above that we can choose a sequence of x_v's approaching lfp.f*_v, and 
so from the continuity of f we could take limits on both sides of (31), giving 
f.(lfp.f*_v).s* ≤ lfp.f*_v.s*, and so we would have 

    F.v := f.(lfp.f*_v).s* ≤ lfp.f*_v.s* = v ,    (32) 

where on the left we are defining a function F of v for use below. Our con- 
tradiction will be achieved by considering a further property of F, beyond the 
F.v ≤ v that we have at (32) already. 

That property is v* ≤ lfp.F. To see that we argue by cases that 

    f.(lfp.f*_{lfp.F}) = lfp.f*_{lfp.F} , 

which by the least-fixed-point property gives us lfp.f ≤ lfp.f*_{lfp.F}. Then, applying 
that inequality at s* itself, we have lfp.f.s* ≤ lfp.f*_{lfp.F}.s*, whence 

      v* 
    = lfp.f.s*                              definition v* 
    ≤ lfp.f*_{lfp.F}.s*                     immediately above 
    = f*_{lfp.F}.(lfp.f*_{lfp.F}).s*        fixed point 
    = lfp.F ,                               definition: f*_x.y.s* = x, for all x, y 

as required. The two cases are 



Case s ≠ s*: We have 

      f.(lfp.f*_{lfp.F}).s 
    = f*_{lfp.F}.(lfp.f*_{lfp.F}).s         definition f*, which agrees with f away from s* 
    = lfp.f*_{lfp.F}.s .                    fixed point 

Case s = s*: We have 

      f.(lfp.f*_{lfp.F}).s* 
    = F.(lfp.F)                             definition F 
    = lfp.F                                 fixed point 
    = f*_{lfp.F}.(lfp.f*_{lfp.F}).s*        definition f* 
    = lfp.f*_{lfp.F}.s* .                   fixed point 

That establishes the equality we used above, and completes the demonstration 
that v* ≤ lfp.F. 

This is where we use the strengthening of the inductive assumption, the upper bound 
on x in (23b): recall Footnote 42. 

This is where analytical, rather than ⊔-, continuity of f is used, since the sequence 
of x_v's is not necessarily increasing; recall Footnote 41. 

Now from that and v < v* we have immediately that v < lfp.F also and, 
since F is monotonic (it is constructed from monotonic pieces), by the least- 
fixed-point property we have F.v ≰ v, which contradicts (32). Therefore our 
assumption must fail: there must be some 0 < ε2' ≤ ε2 for which not all choices 
of x_v satisfying those properties satisfy (31) as well; that is, at least one will 
satisfy x_v.s* < f.x_v.s* at s*. That is the value we take. 
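
As an illustrative sketch only, the construction of F used in the proof above can be tried numerically on a small instance. The code assumes a two-state space and a linear transformer f.x = w + g.x; the names `W`, `G`, `pinned`, `F`, `lfp_scalar` and all concrete numbers are hypothetical and do not come from the paper. It builds f*_v by pinning state s* to the constant v, approximates least fixed points by iteration from zero, and compares v* with lfp.F.

```python
# Hypothetical two-state instance: f.x = w + g.x, with s* chosen as state 0.
W = [0.1, 0.1]
G = [[0.5, 0.25],
     [0.25, 0.25]]
S_STAR = 0


def f(x):
    """Apply the assumed linear transformer f to expectation x."""
    n = len(x)
    return [W[s] + sum(G[s][t] * x[t] for t in range(n)) for s in range(n)]


def pinned(v):
    """Return f*_v: it agrees with f except that state s* always yields v."""
    def f_star(y):
        out = f(y)
        out[S_STAR] = v
        return out
    return f_star


def lfp(h, size=2, rounds=300):
    """Approximate the least fixed point of monotone h by iterating from zero."""
    x = [0.0] * size
    for _ in range(rounds):
        x = h(x)
    return x


def F(v):
    """F.v = f.(lfp.f*_v).s*, following the definition at (32)."""
    return f(lfp(pinned(v)))[S_STAR]


def lfp_scalar(h, rounds=300):
    """Approximate the least fixed point of a monotone real function."""
    v = 0.0
    for _ in range(rounds):
        v = h(v)
    return v


v_star = lfp(f)[S_STAR]   # v* = lfp.f.s*
lfp_F = lfp_scalar(F)     # lfp.F
sample_v = 0.2            # a value strictly below v* in this instance
```

On this particular instance v* and lfp.F agree up to rounding, and the sample value below v* has F.v > v, which is the behaviour the contradiction relies on.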



This is where the strictness is used: recall Footnote 47. 

D Proof of Lem. 11 from App. B 

We prove in several stages that if f is almost-linear then f is ok in each argument 
separately, beginning with some preliminary lemmas. 

Lemma 13 (Almost-feasibility). Let transformer f be almost-linear, and 
suppose for some state s that f.0.y.···.z.s = 0, where without loss of generality 
we concentrate on the first argument of f. Then for any expectation x we have 
f.x.y.···.z.s ≤ (⊔s: S • x.s). 

Proof. This is clear from the explicit form that almost-linear transformers take 
(refer to their definition): if f.0.y.···.z.s is zero, then all non-x terms w.s, h.y.s, 
···, i.z.s in f must be zero; that is, for those values y, ···, z, s we have 
f.x.y.···.z.s = g.x.s for some one-bounded linear transformer g, from which 
property of g we have g.x.s ≤ (⊔s: S • x.s). 

From now on we will fix the non-x arguments of f, and omit them for brevity. 

Lemma 14 (Stationary zeroes). Let transformer f be almost-linear, and 
define its kernel K to be those states on which its least fixed point is zero: that is, 

    K = {s: S | μ.f.s = 0} . 

Then in effect the probabilistic game f cannot escape from K: for any state 
k in K and expectation x, we have 

    f.x.k = f.(x↓K).k , 

where (x↓K).s = x.s if s∈K else 0. 

Proof. Because μ.f is non-zero everywhere outside K, and our state space S is 
finite, there is an ε > 0 so that ε × x↓(S − K) ≤ μ.f whence, for k in K, we 
have 

    f.(ε × x↓(S − K)).k ≤ f.(μ.f).k = μ.f.k = 0 . 

From that, the almost-linearity of f and that ε > 0 we have f.(x↓(S − K)).k = 0. 
Again using almost-linearity, and noting that x = x↓K + x↓(S − K), we continue 

      f.(x↓K + x↓(S − K)).k 
    ≤ f.(x↓K).k + f.(x↓(S − K)).k        almost-linearity 
    = f.(x↓K).k .                         f.(x↓(S − K)).k = 0, shown above 

The opposite inequality is immediate from monotonicity. 
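
A small numerical sketch may help fix ideas for Lemmas 13 and 14. It assumes a three-state space and reads "almost-linear" as f.x = w + g.x with g a non-negative matrix whose rows sum to at most one; that reading, the names `W`, `G`, `restrict`, `samples` and the numbers are illustrative assumptions, not the paper's definitions. The code approximates μ.f by iteration, extracts the kernel K, and checks both the stationary-zeroes equality and the almost-feasibility bound on a few sample expectations.

```python
# Hypothetical three-state instance of f.x = w + g.x.
W = [0.3, 0.0, 0.0]
G = [[0.2, 0.0, 0.5],
     [0.0, 0.5, 0.5],
     [0.0, 0.5, 0.5]]
STATES = range(3)


def f(x):
    """Apply the assumed almost-linear transformer to expectation x."""
    return [W[s] + sum(G[s][t] * x[t] for t in STATES) for s in STATES]


def lfp(rounds=300):
    """Approximate the least fixed point of f by iterating from zero."""
    x = [0.0, 0.0, 0.0]
    for _ in range(rounds):
        x = f(x)
    return x


def restrict(x, keep):
    """Restriction of x to the set keep: x.s inside keep, zero elsewhere."""
    return [x[s] if s in keep else 0.0 for s in STATES]


mu = lfp()
K = {s for s in STATES if mu[s] == 0.0}   # kernel: states where the fixed point is zero

samples = [[1.0, 0.5, 0.25], [0.7, 0.0, 1.0], [0.0, 0.9, 0.1]]

# Lemma 14: on the kernel, f only sees the part of x that lies inside K.
stationary = all(
    abs(f(x)[k] - f(restrict(x, K))[k]) < 1e-12
    for x in samples for k in K
)

# Lemma 13: f.0.k = 0 on the kernel here, so f.x.k is bounded by the maximum of x.
feasible = all(f(x)[k] <= max(x) + 1e-12 for x in samples for k in K)
```

In this instance states 1 and 2 can only move among themselves and carry no constant reward, so they form the kernel and state 0 has no influence on them.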

Lemma 15 (Almost-linear is almost ok). Let transformer f be almost-linear, 
and suppose for some expectation x that x ≪ f.x. Then for all k in the kernel 
K of f we have x.k = 0. 

Proof. If x is not zero on K then K must be non-empty and there must be a state 
k* in K at which x attains a non-zero maximum (⊔k: K • x.k). Then because 
x ≪ f.x we have x.k* < f.x.k* = f.(x↓K).k* from Lem. 14. 

Now f.0.k* ≤ f.(μ.f).k* = μ.f.k* = 0, since k* ∈ K, so that from Lem. 13 
we have as well that f.(x↓K).k* ≤ (⊔k: K • x.k). Taken together with the above, 
that gives x.k* < (⊔k: K • x.k), contradicting the choice of k*. 
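
The statement of Lemma 15 can be checked by brute force on a coarse grid, as a sketch under stated assumptions. The three-state transformer f.x = w + g.x, its kernel {1, 2}, and the reading of x ≪ y as "x.s < y.s wherever x.s is non-zero" are all assumptions made for illustration; the names `well_below`, `grid` and `candidates` are hypothetical.

```python
from itertools import product

# Hypothetical three-state instance of f.x = w + g.x.
W = [0.3, 0.0, 0.0]
G = [[0.2, 0.0, 0.5],
     [0.0, 0.5, 0.5],
     [0.0, 0.5, 0.5]]
STATES = range(3)
KERNEL = {1, 2}   # states where the least fixed point of this f is zero


def f(x):
    """Apply the assumed almost-linear transformer to expectation x."""
    return [W[s] + sum(G[s][t] * x[t] for t in STATES) for s in STATES]


def well_below(x, y):
    """Assumed reading of x ≪ y: x.s < y.s at every state where x.s is non-zero."""
    return all(x[s] == 0 or x[s] < y[s] for s in STATES)


# Enumerate expectations with values in {0, 0.25, 0.5, 0.75, 1} at each state.
grid = [i / 4 for i in range(5)]
candidates = [list(x) for x in product(grid, repeat=3) if well_below(x, f(list(x)))]

# Lemma 15 predicts that every such candidate vanishes on the kernel.
zero_on_kernel = all(x[k] == 0 for x in candidates for k in KERNEL)
```

Every grid point that passes the `well_below` test is zero on states 1 and 2, as the lemma predicts for this instance.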



We can now proceed with the proof of Lem. 11: we assume x ≪ f.x. 
First choose a real scalar ε > 0 so that 

    μ.f.s / x.s  >  ε / (ε + 1)    for all s with x.s ≠ 0, 

which is possible because S is finite, and note that (ε+1)(μ.f) − εx ≥ 0 then 
holds for all states, since when μ.f.s = 0 we have from Lem. 15 that x.s = 0 
as well. In fact we can decrease ε still further, if necessary, to achieve 

    0 ≤ (ε+1)(μ.f) − εx ≤ 1 , 

sufficient to use the expression as a one-bounded expectation in the argument 
below. 

Now because f is almost-linear, as a function of x it is of the form w + g.x 
for fixed expectation w and linear g (we absorb the fixed contributions h.y, ···, i.z 
of other arguments into w). Applying f to our expression above, and 
using linearity of g, we have 

      f.((ε+1)(μ.f) − εx) 
    = w + g.((ε+1)(μ.f) − εx) 
    = w + g.((ε+1)(μ.f)) − g.(εx)                 g linear 
    = w + (ε+1)(g.(μ.f)) − ε(g.x) 
    = (ε+1)w + (ε+1)(g.(μ.f)) − εw − ε(g.x) 
    = (ε+1)(f.(μ.f)) − ε(f.x) 
    ≤ (ε+1)(μ.f) − εx ,                           f.(μ.f) = μ.f, and x ≤ f.x because x ≪ f.x 

so showing that (ε+1)(μ.f) − εx is a pre-fixed-point of f. 
Thus by the least-fixed-point property we have 

    μ.f ≤ (ε+1)(μ.f) − εx , 

whence by arithmetic (rearranging, dividing by ε > 0) we have x ≤ μ.f. 

Thus we have proved that all almost-linear transformers are ok. 
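
The calculation in this proof can be replayed numerically, as a sketch only. The two-state transformer f.x = w + g.x, the choice x = μ.f / 2, and ε = 1 are all illustrative assumptions (the names `W`, `G`, `eps`, `y` are hypothetical); the code checks that x ≪ f.x, that y = (ε+1)(μ.f) − εx is one-bounded and satisfies f.y ≤ y, and that the conclusion x ≤ μ.f holds.

```python
# Hypothetical two-state instance of f.x = w + g.x.
W = [0.25, 0.1]
G = [[0.5, 0.0],
     [0.25, 0.25]]
STATES = range(2)


def f(x):
    """Apply the assumed almost-linear transformer to expectation x."""
    return [W[s] + sum(G[s][t] * x[t] for t in STATES) for s in STATES]


def lfp(rounds=300):
    """Approximate the least fixed point of f by iterating from zero."""
    x = [0.0, 0.0]
    for _ in range(rounds):
        x = f(x)
    return x


mu = lfp()                          # approximately [0.5, 0.3]
x = [0.5 * m for m in mu]           # a sample expectation with x ≪ f.x
fx = f(x)
strictly_below = all(x[s] == 0 or x[s] < fx[s] for s in STATES)

eps = 1.0
# The proof's choice of eps requires mu.s / x.s > eps / (eps + 1) wherever x.s != 0.
ratio_ok = all(x[s] == 0 or mu[s] / x[s] > eps / (eps + 1) for s in STATES)

y = [(eps + 1) * mu[s] - eps * x[s] for s in STATES]
fy = f(y)

one_bounded = all(0.0 <= y[s] <= 1.0 for s in STATES)
pre_fixed = all(fy[s] <= y[s] + 1e-12 for s in STATES)          # f.y <= y
identity = all(
    abs(fy[s] - ((eps + 1) * mu[s] - eps * fx[s])) < 1e-9       # f.y = (eps+1)(mu.f) - eps(f.x)
    for s in STATES
)
conclusion = all(x[s] <= mu[s] + 1e-12 for s in STATES)         # x <= mu.f
```

The `identity` check mirrors the middle of the displayed derivation, where linearity of g turns f.y into (ε+1)(f.(μ.f)) − ε(f.x).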



